Lecture 7, Econ 2001, 2015 August 18
Lecture 7 Outline

First, the theorem of the maximum, an amazing result about continuity in optimization problems. Then we start linear algebra, mostly looking at familiar definitions.

1 Theorem of the Maximum
2 Matrices
3 Matrix Algebra
4 Inverse of a Matrix
5 Systems of Linear Equations
6 Span and Basis
Berge's Theorem

Also called the Theorem of the Maximum, Berge's theorem provides conditions under which, in a constrained maximization problem, the maximum value and the maximizing vectors are continuous with respect to the parameters.

Theorem
Let A ⊆ R^m and X ⊆ R^n both be non-empty. Assume f : A × X → R is a continuous function and ϕ : A → 2^X is a continuous correspondence such that ϕ(a) is compact and non-empty for all a ∈ A. For all a ∈ A define
h(a) = max_{x ∈ ϕ(a)} f(a, x)   and   µ(a) = {x ∈ ϕ(a) : h(a) = f(a, x)}.
Then µ(a) is non-empty for all a ∈ A, µ is upper hemicontinuous, and h is continuous.

REMARK
If µ is a function, then it is a continuous function, since an upper hemicontinuous function is continuous.
µ(a) is non-empty for all a ∈ A

Proof.
By assumption, ϕ(a) is compact and non-empty for all a ∈ A, and f is continuous. Since a continuous function on a non-empty compact set attains its maximum (extreme value theorem), µ(a) is non-empty for all a ∈ A.
Prove that µ is upper hemicontinuous by contradiction.

Proof.
Since ϕ is uhc and ϕ(a) is closed for all a, ϕ has closed graph (see the theorem from last class).

If µ is not uhc, there are a sequence {a_n} in A converging to a ∈ A and an ε > 0 such that, for every n, there is a point x_n ∈ µ(a_n) that does not belong to B_ε(µ(a)). The contradiction comes from finding a subsequence of {x_n} that converges to a point x* ∈ µ(a).

Since ϕ is uhc, there is a δ > 0 such that if a' ∈ B_δ(a), then ϕ(a') ⊆ B_ε(ϕ(a)). Since ϕ(a) is compact, the set C = {x ∈ R^n : ‖x − y‖ ≤ ε for some y ∈ ϕ(a)} is compact.

Since lim_n a_n = a, there is a positive integer N such that a_n ∈ B_δ(a) for n ≥ N. Therefore x_n ∈ µ(a_n) ⊆ ϕ(a_n) ⊆ C for n ≥ N, hence x_n belongs to the compact set C for n ≥ N. By Bolzano-Weierstrass, there is a subsequence {x_{n_k}} that converges to some x* ∈ X. Call this subsequence {x_n} again. Since ϕ has closed graph, we know that x* ∈ ϕ(a). So lim_n a_n = a ∈ A, x_n ∈ µ(a_n) for all n, and lim_n x_n = x* ∈ ϕ(a).

Next, show that x* ∈ µ(a). Since x* ∈ ϕ(a), we know that f(a, x*) ≤ max_{x ∈ ϕ(a)} f(a, x) = h(a). If we prove that f(a, x*) = h(a) we are done (why? because then x* ∈ µ(a), while x_n → x* implies x_n ∈ B_ε(µ(a)) for large n, contradicting the choice of the x_n).

Suppose that f(a, x*) < h(a). Then there exists x̂ ∈ ϕ(a) such that f(a, x̂) > f(a, x*). Because ϕ is lhc, for each n there is an x̂_n ∈ ϕ(a_n) such that lim_n x̂_n = x̂. Since f is continuous, lim_n f(a_n, x̂_n) = f(a, x̂) > f(a, x*). Similarly, lim_n f(a_n, x_n) = f(a, x*). Therefore there is a positive integer K such that, for n ≥ K,
f(a_n, x̂_n) > f(a_n, x_n),
which is impossible because x̂_n ∈ ϕ(a_n) and f(a_n, x_n) = max_{x ∈ ϕ(a_n)} f(a_n, x). A contradiction.

This contradiction proves that f(a, x*) = h(a), so x* ∈ µ(a). This completes the proof that µ is upper hemicontinuous.
Proof that h is continuous.

Proof.
Let {a_n} be a sequence in A converging to â ∈ A. We must show that h(a_n) converges to h(â). For each n, let x_n ∈ µ(a_n), and let x̂ ∈ µ(â), so that h(a_n) = f(a_n, x_n) for all n and h(â) = f(â, x̂).

Suppose not: the sequence {h(a_n)} does not converge to h(â). Then there are an ε > 0 and a subsequence {n_k} such that |h(a_{n_k}) − h(â)| ≥ ε for all k. Hence
|f(a_{n_k}, x_{n_k}) − f(â, x̂)| ≥ ε for all k.   (∗)

Since µ is uhc and µ(a) is closed for all a, we know that µ has closed graph. By uhc of µ, there is a γ > 0 such that if a ∈ B_γ(â), then µ(a) ⊆ B_1(µ(â)); hence µ(a) is contained in the compact set G = {x ∈ X : ‖x − y‖ ≤ 1 for some y ∈ µ(â)}.

Since lim_k a_{n_k} = â, we may assume that a_{n_k} ∈ B_γ(â) for all k, and hence x_{n_k} ∈ µ(a_{n_k}) ⊆ G for all k. Since G is compact, Bolzano-Weierstrass implies there is a subsequence of {x_{n_k}}, call it {x_{n_k}} again, such that lim_k x_{n_k} = x* for some x* ∈ G. Since lim_k a_{n_k} = â, lim_k x_{n_k} = x*, x_{n_k} ∈ µ(a_{n_k}) for all k, and µ has closed graph, it follows that x* ∈ µ(â). Therefore f(â, x*) = h(â) = f(â, x̂).

(∗) and the continuity of f imply that
ε ≤ lim_k |f(a_{n_k}, x_{n_k}) − f(â, x̂)| = |f(â, x*) − f(â, x̂)| = 0,
which is impossible since ε > 0. This contradiction proves that lim_n h(a_n) = h(â), and so h is continuous.
Theorem (Berge's Theorem)
Let A ⊆ R^m and X ⊆ R^n both be non-empty. Assume f : A × X → R is a continuous function and ϕ : A → 2^X is a continuous correspondence such that ϕ(a) is compact and non-empty for all a ∈ A. For all a ∈ A define
h(a) = max_{x ∈ ϕ(a)} f(a, x)   and   µ(a) = {x ∈ ϕ(a) : h(a) = f(a, x)}.
Then µ(a) is non-empty for all a ∈ A, µ is upper hemicontinuous, and h is continuous.

This is an amazing result. When solving a constrained optimization problem, if
- the objective function is continuous, and
- the correspondence defining the constraint set is continuous, compact-valued, and non-empty-valued,
then
- the problem has a solution;
- the optimized (value) function is continuous in the parameters;
- the correspondence defining the optimal choice set is upper hemicontinuous, and if it is a function, it is a continuous function.
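To see the theorem at work, here is a minimal numerical sketch; the objective f(a, x) = ax − x² and the constraint correspondence ϕ(a) = [0, a] are made-up illustrations (not from the lecture) that satisfy the hypotheses for a > 0.

```python
# A minimal numerical sketch of the Theorem of the Maximum (assumed example):
# f(a, x) = a*x - x^2 is continuous, and phi(a) = [0, a] is a continuous,
# compact- and non-empty-valued correspondence for a > 0.
import numpy as np

def h(a, grid_size=1001):
    """Approximate h(a) = max over x in [0, a] of f(a, x) on a fine grid."""
    x = np.linspace(0.0, a, grid_size)
    return np.max(a * x - x ** 2)

# Berge's theorem predicts h is continuous in the parameter a; here
# h(a) = a^2/4 exactly, and small changes in a move h(a) only a little.
for a in [0.50, 0.51, 1.00, 1.01]:
    print(f"h({a}) = {h(a):.4f}")
```

Here the maximizer µ(a) = {a/2} is single-valued, so, as the remark above says, it is a continuous function of a.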
Matrices

Definition
An m × n matrix is an element of M_{m×n} (the set of all m × n matrices), written as

A = [ α_11 α_12 ... α_1n ]
    [ α_21 α_22 ... α_2n ]
    [  ...               ]
    [ α_m1 α_m2 ... α_mn ]   = [α_ij]

where m denotes the number of rows and n denotes the number of columns.

An m × n matrix is just a collection of mn numbers organized in a particular way. We can think of a matrix as an element of R^{mn} if all entries are real numbers; the extra notation makes it possible to distinguish the way the numbers are organized.
Vectors

Example

A = [ 0 1 5 ]
    [ 6 0 2 ]   (2 × 3)

Notation
Vectors are a special case of matrices:

x = [ x_1 ]
    [ x_2 ]
    [ ... ]
    [ x_n ]   ∈ M_{n×1}

This notation emphasizes that we think of a vector with n components as a matrix with n rows and 1 column.
Transpose of a Matrix

Definition
The transpose of a matrix A is denoted A^t. To form the transpose, the first row of the original matrix becomes the first column of the new (transposed) matrix, the second row becomes the second column, and so on:

A^t = [ α_11 α_21 ... α_m1 ]
      [ α_12 α_22 ... α_m2 ]
      [  ...               ]
      [ α_1n α_2n ... α_mn ]   = [α_ji]

Clearly, if A ∈ M_{m×n}, then A^t ∈ M_{n×m}.

Definition
A matrix A is symmetric if A = A^t.

Example
Continuing the previous example,

A^t = [ 0 6 ]
      [ 1 0 ]
      [ 5 2 ]   (3 × 2)
Matrix Algebra: Addition

Definition (Matrix Addition)
If A (m × n) = [α_ij] and B (m × n) = [β_ij], then

A + B = D (m × n) = [ α_11 + β_11  α_12 + β_12  ...  α_1n + β_1n ]
                    [ α_21 + β_21  α_22 + β_22  ...  α_2n + β_2n ]
                    [  ...                                       ]
                    [ α_m1 + β_m1  α_m2 + β_m2  ...  α_mn + β_mn ]

that is, D = [δ_ij] = [α_ij + β_ij].
Matrix Algebra: Multiplication

Definition (Matrix Multiplication)
If A (m × k) and B (k × n) are given, then their product is defined as

A · B = C (m × n) = [c_ij]   with   c_ij = Σ_{l=1}^{k} a_il b_lj.

Note that the only index being summed over is l, so the number of columns of A must equal the number of rows of B.
Matrix Algebra: Multiplication

A · B = C (m × n) = [c_ij], where c_ij = Σ_{l=1}^{k} a_il b_lj.

Example
Let

A = [ 0 1 5 ]   (2 × 3)   and   B = [ 0 3 ]   (3 × 2)
    [ 6 0 2 ]                       [ 1 0 ]
                                    [ 2 3 ]

Then

A · B = [ (0·0)+(1·1)+(5·2)   (0·3)+(1·0)+(5·3) ]   =   [ 11 15 ]
        [ (6·0)+(0·1)+(2·2)   (6·3)+(0·0)+(2·3) ]       [  4 24 ]
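The arithmetic above is easy to verify with numpy (used here purely as a check; it is not part of the lecture):

```python
# Verifying the worked multiplication example with numpy.
import numpy as np

A = np.array([[0, 1, 5],
              [6, 0, 2]])        # 2 x 3
B = np.array([[0, 3],
              [1, 0],
              [2, 3]])           # 3 x 2

print(A @ B)                     # [[11 15]
                                 #  [ 4 24]] -- matches the example
```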
Matrix Algebra: Multiplication

REMARK
In general, A · B ≠ B · A.

Example
With A (2 × 3) and B (3 × 2) as in the previous example, A · B is 2 × 2 while B · A is 3 × 3, so the two products cannot be equal. Worse, if B were instead 3 × 4, then A · B (2 × 4) would be defined while B · A would not: the product on the right is not defined!
Matrix Algebra: Square and Identity Matrices

Definition
Any matrix that has the same number of rows as columns is known as a square matrix, denoted A (n × n).

Definition
The identity matrix, denoted I_n, is the n × n matrix

I_n = [ 1 0 ... 0 ]
      [ 0 1 ... 0 ]
      [ ...       ]
      [ 0 0 ... 1 ]

REMARK
Any matrix multiplied by the identity matrix gives back the original matrix: for any A (m × n),
A · I_n = A   and   I_m · A = A.
Diagonal and Triangular Matrices

Definition
A square matrix is called a diagonal matrix if a_ij = 0 whenever i ≠ j.

Definition
A square matrix is called an upper triangular matrix (resp. lower triangular) if a_ij = 0 whenever i > j (resp. i < j).

Diagonal matrices are easy to deal with. Triangular matrices are also tractable (see the sketch below). In many applications you can replace an arbitrary square matrix with a related diagonal matrix (a super useful property in macro and metrics). We will prove this.
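Why "tractable"? A minimal sketch, assuming an upper triangular U with nonzero diagonal (the routine and example data are illustrative, not from the lecture): a system Ux = y can be solved by back substitution, one unknown at a time.

```python
# Back substitution for an upper triangular system Ux = y
# (assumes all diagonal entries of U are nonzero).
import numpy as np

def back_substitution(U, y):
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):                   # last row first
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

U = np.array([[2.0, 1.0, 1.0],
              [0.0, 3.0, 2.0],
              [0.0, 0.0, 4.0]])
y = np.array([6.0, 10.0, 8.0])
print(back_substitution(U, y))   # [1. 2. 2.]; check: U @ [1, 2, 2] = y
```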
Matrix Inversion

Definition
We say a matrix A (n × n) is invertible or non-singular if there exists a matrix B (n × n) such that
A · B = B · A = I_n.
If A is invertible, we denote its inverse by A^{-1}, so that
A · A^{-1} = A^{-1} · A = I_n.
A square matrix that is not invertible is called singular.
Determinant of a Matrix

Definition
The determinant of a matrix A (written det A = |A|) is defined inductively as follows.

For n = 1, A (1 × 1):  det A = |A| = a_11.

For n ≥ 2, A (n × n):
det A = |A| = a_11 det A_11 − a_12 det A_12 + a_13 det A_13 − ... ± a_1n det A_1n,
where A_1j is the (n − 1) × (n − 1) matrix formed by deleting the first row and jth column of A.

Note
The determinant is useful primarily because a matrix is invertible if and only if its determinant is not 0.
Determinant of a Matrix

det A = |A| = a_11 det A_11 − a_12 det A_12 + a_13 det A_13 − ... ± a_1n det A_1n,
where A_1j is the (n − 1) × (n − 1) matrix formed by deleting the first row and jth column of A.

Example
If A (2 × 2) = [a_ij], then
det A = a_11 a_22 − a_12 a_21.

Example
If A (3 × 3) = [a_ij], then

det A = a_11 · | a_22 a_23 |  −  a_12 · | a_21 a_23 |  +  a_13 · | a_21 a_22 |
               | a_32 a_33 |            | a_31 a_33 |            | a_31 a_32 |
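The inductive definition translates directly into code. This recursive sketch (my transcription, not the lecture's) is exponentially slow and only for illustration; libraries compute determinants differently.

```python
# Determinant by cofactor expansion along the first row.
import numpy as np

def det(A):
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for j in range(n):
        minor = np.delete(A[1:, :], j, axis=1)   # drop first row and column j
        total += (-1) ** j * A[0, j] * det(minor)
    return total

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
print(det(A), np.linalg.det(A))   # both give -2.0 (up to floating point)
```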
Adjoint and Inverse

Definition
The adjoint of a matrix A (n × n) is the n × n matrix with entry ij equal to
(adj A)_ij = (−1)^{i+j} det A_ji,
where A_ji is the (n − 1) × (n − 1) matrix formed by deleting the jth row and ith column of A.

Example
If A is a (2 × 2) matrix, then

adj A = [  a_22  −a_12 ]
        [ −a_21   a_11 ]

FACT
The inverse of an invertible matrix A (n × n) is given by
A^{-1} = (1 / det A) · adj A.

Example
If A is a (2 × 2) matrix and invertible, then

A^{-1} = (1 / det A) · [  a_22  −a_12 ]
                       [ −a_21   a_11 ]
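A quick numerical check of the 2 × 2 formula (the matrix here is an arbitrary invertible example):

```python
# Inverse of a 2x2 matrix via the adjoint formula above.
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
det_A = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]    # -2.0, so A is invertible
adj_A = np.array([[ A[1, 1], -A[0, 1]],
                  [-A[1, 0],  A[0, 0]]])
A_inv = adj_A / det_A
print(A @ A_inv)                                 # identity, up to rounding
```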
Inner Product

Definition
If x, y ∈ M_{n×1}, then the inner product (or dot product or scalar product) is given by
x^t y = x_1 y_1 + x_2 y_2 + ... + x_n y_n = Σ_{i=1}^{n} x_i y_i.
Note that x^t y = y^t x.

Notation
We usually write the inner product as x · y, read "x dot y":
x · y = Σ_{i=1}^{n} x_i y_i.
Inner Product, Distance, and Norm

Remember, the Euclidean distance is d(x, y) = ‖x − y‖, where
‖z‖ = √(z_1² + z_2² + ... + z_n²) = √(Σ_{i=1}^{n} z_i²).
Under the Euclidean metric, the distance between two points is the length of the line segment connecting them. In particular ‖z‖, the distance between 0 and z, is the norm of z.

FACT
The norm comes from the inner product: ‖z‖² = z · z.
Orthogonality

Definition
We say that x and y are orthogonal (at right angles, perpendicular) if and only if their inner product is zero: two vectors are orthogonal whenever x · y = 0.

This follows from the Law of Cosines: if a triangle has sides A, B, and C and the angle θ is opposite side C, then
c² = a² + b² − 2ab cos(θ),
where a, b, and c are the lengths of A, B, and C respectively.

Take A and B to be the vectors x and y, let θ be the angle between them, and notice that side C is x − y, so c = ‖x − y‖. Then, expanding the inner product,
(x − y) · (x − y) = x · x + y · y − 2(x · y),
while the law of cosines gives
(x − y) · (x − y) = x · x + y · y − 2 ‖x‖ ‖y‖ cos(θ).
Hence x · y = ‖x‖ ‖y‖ cos(θ), and (x · y) / (‖x‖ ‖y‖) = cos(θ).

1 The inner product of two non-zero vectors is zero if and only if the cosine of the angle between them is zero (cosine = 0 means they are perpendicular).
2 Since the absolute value of the cosine is at most one, |x · y| ≤ ‖x‖ ‖y‖ (the Cauchy-Schwarz inequality).
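These facts (orthogonality, the norm from the inner product, and the cosine bound) are easy to check on concrete vectors; the vectors below are arbitrary choices, not from the lecture.

```python
# Inner product, norm, and orthogonality on example vectors.
import numpy as np

x = np.array([1.0, 2.0, 2.0])
y = np.array([2.0, 0.0, -1.0])

print(x @ y)            # 1*2 + 2*0 + 2*(-1) = 0, so x and y are orthogonal
print(np.sqrt(x @ x))   # 3.0 = ||x||, since ||x||^2 = x . x
print(abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y))   # |x.y| <= ||x|| ||y||: True
```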
Systems of Linear Equations

Consider the system of m linear equations in n variables:

y_1 = α_11 x_1 + α_12 x_2 + ... + α_1n x_n
y_2 = α_21 x_1 + α_22 x_2 + ... + α_2n x_n
...
y_i = α_i1 x_1 + α_i2 x_2 + ... + α_in x_n
...
y_m = α_m1 x_1 + α_m2 x_2 + ... + α_mn x_n

where the variables are the x_j. This can be written using matrix notation.
Matrices and Systems of Linear Equations

Notation
A system of m linear equations in n variables can be written as

y = A x,   with y (m × 1), A (m × n), x (n × 1),

where y = [y_1 ... y_m]^t, x = [x_1 ... x_n]^t, and A = [α_ij]; thus

[ y_1 ]   [ α_11 α_12 ... α_1n ] [ x_1 ]
[ y_2 ] = [ α_21 α_22 ... α_2n ] [ x_2 ]
[ ... ]   [  ...               ] [ ... ]
[ y_m ]   [ α_m1 α_m2 ... α_mn ] [ x_n ]
Facts about solutions to linear equations

Definition
A system of equations of the form Ax = 0 is called homogeneous.

A homogeneous system always has a solution (x = 0). This solution is not unique if there are more unknowns than equations, or if there are as many equations as unknowns and A is singular.

Theorem
When A is square, the system Ax = y has a unique solution if and only if A is nonsingular, in which case the solution is x = A^{-1} y. If A is singular, then there is a nonzero z such that Az = 0, so if you can find one solution to Ax = y, you can find infinitely many (x + λz also solves the system, for every scalar λ).
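A minimal check of the theorem, assuming a made-up nonsingular A:

```python
# When A is square and nonsingular, Ax = y has the unique solution A^{-1} y.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
y = np.array([3.0, 5.0])

print(np.linalg.det(A))       # 5.0 != 0, so A is nonsingular
x = np.linalg.solve(A, y)     # numerically preferable to forming A^{-1}
print(x, A @ x)               # x = [0.8 1.4]; A @ x recovers y
```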
Matrices as Linear Functions

REMARK: Matrices as functions
A matrix is a function that maps vectors into vectors. This function applies a linear transformation to the vector x to get another vector y:

A x = y,   with A (m × n), x (n × 1), y (m × 1),

that is,

[ α_11 α_12 ... α_1n ] [ x_1 ]   [ y_1 ]
[ α_21 α_22 ... α_2n ] [ x_2 ] = [ y_2 ]
[  ...               ] [ ... ]   [ ... ]
[ α_m1 α_m2 ... α_mn ] [ x_n ]   [ y_m ]
Linear Independence

R^n is a vector space, so sums and scalar multiples of elements of R^n are also elements of it. Given some vectors in R^n, any linear combination of them is also in R^n.

Definition
Let X be a vector space over R. A linear combination of x_1, ..., x_n ∈ X is a vector of the form
y = Σ_{i=1}^{n} α_i x_i,   where α_1, ..., α_n ∈ R;
α_i is the coefficient of x_i in the linear combination.

Definition
A collection of vectors {x_1, ..., x_k}, where each x_i ∈ X (a vector space over R), is linearly independent if
Σ_{i=1}^{k} λ_i x_i = 0 if and only if λ_i = 0 for all i.

In other words, the collection {x_1, ..., x_k} ⊆ X is linearly independent if and only if
Σ_{i=1}^{k} λ_i x_i = 0  ⟹  λ_i = 0 for all i.
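One way to test the definition numerically (a sketch using numpy's rank routine, which is not part of the lecture's toolkit): stack the vectors as columns; they are linearly independent exactly when the rank of the resulting matrix equals the number of vectors.

```python
# Linear independence via matrix rank.
import numpy as np

v1 = np.array([1.0, 0.0])
v2 = np.array([0.0, 1.0])
v3 = np.array([1.0, 1.0])

print(np.linalg.matrix_rank(np.column_stack([v1, v2])))      # 2 of 2: independent
print(np.linalg.matrix_rank(np.column_stack([v1, v2, v3])))  # 2 of 3: dependent
```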
Span and Dimension

The span of a collection of vectors is the set of all their linear combinations.

Definition
If V = {v_1, ..., v_k} ⊆ X, the span of V is the set of all linear combinations of the elements of V:
span V = {y ∈ X : y = Σ_{i=1}^{k} λ_i v_i with λ_1, ..., λ_k ∈ R}.

Fact
span(V) is the smallest vector space containing all of the vectors in V.

Definition
A set V ⊆ X spans X if span V = X.

Definition
The dimension of a vector space V is the smallest number of vectors that span V.

Example
R^n has dimension n: you can span all of it with only n vectors.
Span and Linear Independence

Theorem
If X = {x_1, ..., x_k} is a linearly independent collection of vectors in R^n and z ∈ span(X), then there are unique λ_1, ..., λ_k such that z = Σ_{i=1}^{k} λ_i x_i.

Proof.
Existence follows from the definition of span. For uniqueness, take two linear combinations of the elements of X that yield z, so that
z = Σ_{i=1}^{k} λ_i x_i   and   z = Σ_{i=1}^{k} λ'_i x_i.
Subtract one equation from the other to obtain
z − z = 0 = Σ_{i=1}^{k} (λ_i − λ'_i) x_i.
By linear independence, λ_i − λ'_i = 0 for all i, as desired.
(Hamel) Basis

Definition
A basis for a vector space V is a linearly independent set of vectors in V that spans V.

Remark
A basis must satisfy two conditions:
1 It is linearly independent.
2 It spans the vector space.

Example
{(1, 0), (0, 1)} is a basis for R^2 (the standard basis).
Basis

Example
{(1, 1), (−1, 1)} is another basis for R^2.

Let (x, y) = α(1, 1) + β(−1, 1) for some α, β ∈ R. Therefore
x = α − β   and   y = α + β,
so
x + y = 2α ⟹ α = (x + y)/2   and   y − x = 2β ⟹ β = (y − x)/2.
Hence
(x, y) = ((x + y)/2)(1, 1) + ((y − x)/2)(−1, 1).
Since (x, y) is an arbitrary element of R^2, {(1, 1), (−1, 1)} spans R^2.

If (x, y) = (0, 0), then α = (0 + 0)/2 = 0 and β = (0 − 0)/2 = 0, so the coefficients are all zero, and {(1, 1), (−1, 1)} is linearly independent.

Since it is linearly independent and spans R^2, it is a basis.
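Finding the coefficients in this example is just a 2 × 2 linear system; a quick check with arbitrary x and y:

```python
# Coefficients of (x, y) in the basis {(1, 1), (-1, 1)}.
import numpy as np

B = np.column_stack([[1.0, 1.0], [-1.0, 1.0]])   # basis vectors as columns
x, y = 3.0, 7.0
alpha, beta = np.linalg.solve(B, np.array([x, y]))
print(alpha, beta)            # 5.0 2.0, matching (x + y)/2 and (y - x)/2
```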
Basis

Example
{(1, 0), (0, 1), (1, 1)} is not a basis for R^2 (why?). Since
1·(1, 0) + 1·(0, 1) + (−1)·(1, 1) = (0, 0),
the set is not linearly independent.

Example
{(1, 0, 0), (0, 1, 0)} is not a basis for R^3, because it does not span R^3 (why? no vector with a nonzero third component is a linear combination of these two).
Span, Basis, and Linear Independence

Theorem
If X = {x_1, ..., x_k} is a linearly independent collection of vectors in R^n and z ∈ span(X), then there are unique λ_1, ..., λ_k such that z = Σ_{i=1}^{k} λ_i x_i.

Put this statement together with the fact that R^n has dimension n. Take X = V, a basis for R^n. Then span(X) = span(V) = R^n. Therefore any vector in R^n can be written uniquely as Σ_{i=1}^{n} λ_i v_i.
Tomorrow

More linear algebra:
1 Eigenvectors and eigenvalues
2 Diagonalization
3 Quadratic forms
4 Definiteness of quadratic forms
5 Unique representation of vectors