(Homework 1: Chapter 1: Exercises 1-7, 9, 11, 19, due Monday June 11th. See also the course website for lectures, assignments, etc.)

Note: today's lecture is primarily about definitions. Lots of definitions. We will use these in the next sections.

1 Vectors in $\mathbb{R}^n$

When working with functions of several variables, we consider the input as an $n$-vector, where $n$ refers to the dimension of the space we are working in ($n$ = the number of variables). Thus $x \in \mathbb{R}^n$ is defined as

$$x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = (x_1, x_2, \ldots, x_n)$$

Given two or more vectors, we define the following vector space operations:

a) Addition: $x + y = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$
b) Scalar multiplication: $\alpha x = (\alpha x_1, \alpha x_2, \ldots, \alpha x_n)$
c) Dot product: $x \cdot y = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$ (Also known as the inner product)
d) Norm: $\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2} = (x \cdot x)^{1/2}$ (The length of the vector)

The inner product and norm operations have several important properties.

Defn Properties of the inner product, given vectors $x$, $y$, $z$ and a constant $\alpha$:

  $x \cdot y = y \cdot x$
  $(x + y) \cdot z = x \cdot z + y \cdot z$
  $(\alpha x) \cdot y = \alpha (x \cdot y)$

Defn Properties of the norm, given two vectors $x$ and $y$ and a constant $\alpha$:

  $\|x\| \ge 0$, and $\|x\| = 0$ if and only if $x = 0$
  $\|\alpha x\| = |\alpha| \, \|x\|$
  $\|x + y\| \le \|x\| + \|y\|$ (Known as the triangle inequality. The length of the longest side of a triangle is never longer than the sum of the lengths of the shorter sides)
  $|x \cdot y| \le \|x\| \, \|y\|$ (Known as the Cauchy-Schwarz inequality, and sometimes as the Cauchy-Schwarz-Buniakowsky inequality)
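(The four operations above are short enough to try out directly. The following is a minimal sketch in plain Python, not part of the course material; the function names `add`, `scale`, `dot`, `norm` and the sample vectors are chosen here purely for illustration. It checks the triangle and Cauchy-Schwarz inequalities on one pair of vectors, which illustrates the inequalities but of course does not prove them.)

```python
import math

def add(x, y):
    # Componentwise sum: (x_1 + y_1, ..., x_n + y_n)
    return [xi + yi for xi, yi in zip(x, y)]

def scale(alpha, x):
    # Scalar multiple: (alpha * x_1, ..., alpha * x_n)
    return [alpha * xi for xi in x]

def dot(x, y):
    # Inner product: x_1 * y_1 + ... + x_n * y_n
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    # Length of the vector: (x . x)^(1/2)
    return math.sqrt(dot(x, x))

# Illustrative vectors in R^3 (assumed for this example only)
x = [1.0, 2.0, 2.0]
y = [3.0, 0.0, -4.0]

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
triangle_holds = norm(add(x, y)) <= norm(x) + norm(y)

# Cauchy-Schwarz inequality: |x . y| <= ||x|| ||y||
cauchy_schwarz_holds = abs(dot(x, y)) <= norm(x) * norm(y)

print(norm(x), norm(y), dot(x, y))
print(triangle_holds, cauchy_schwarz_holds)
```

(Here $\|x\| = 3$, $\|y\| = 5$ and $x \cdot y = -5$, so both inequalities hold with room to spare.)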
We also need to define the distance between two vectors. In $\mathbb{R}$, the distance between two numbers $x_1$ and $x_2$ is $|x_1 - x_2|$. In $\mathbb{R}^2$ we use the Euclidean distance between $a = (x_a, y_a)$ and $b = (x_b, y_b)$, or

$$\varphi(a, b) = \sqrt{(x_a - x_b)^2 + (y_a - y_b)^2}$$

We can define the distance for any positive dimension $n$:

Defn The distance between two vectors $x$ and $y$ in $\mathbb{R}^n$ is defined as

$$\varphi(x, y) = \|x - y\| = \left( \sum_{i=1}^{n} (x_i - y_i)^2 \right)^{1/2}$$

Lastly we need the definition of an $n$-dimensional ball to define open and closed sets: define the ball

$$B(x, r) = \{ y \in \mathbb{R}^n ; \varphi(x, y) < r \}$$

(This is an open ball; the boundary is not included)

Given a set $D \subset \mathbb{R}^n$, we define $x$ as an interior point if there is some radius $r > 0$ such that $B(x, r) \subset D$. (In other words, we can draw a ball, very small if necessary, around $x$ so that the entire ball is contained inside $D$. Points on the boundary of $D$, for instance, can never be interior points)

A set $D \subset \mathbb{R}^n$ is open if it is equal to its own interior $D^0$, where $D^0$ is the set of all interior points of $D$.

Ex Examples of open sets:

  $(0, 1) \subset \mathbb{R}$ is open
  $\{(x, y) ; x^2 + y^2 < 1\} \subset \mathbb{R}^2$ is open (and is the unit disk centered at 0, without its boundary circle)

A set $D \subset \mathbb{R}^n$ is closed if its complement $D^c$ is open.

Ex Examples of closed sets:

  $[0, 1] \subset \mathbb{R}$ is closed
  $\{(x, y) ; x^2 + y^2 \le 1\} \subset \mathbb{R}^2$ is closed (and is the unit disk centered at 0, but now includes the boundary)

Typically open sets involve strict inequalities, and closed sets involve $\le$ or $\ge$.

(Odd as it sounds, $\mathbb{R}^n$ is both open and closed, since every one of its points is an interior point, and its complement is the empty set, which is technically also open. Apart from the empty set itself, this is the only subset of $\mathbb{R}^n$ which is both open and closed)

2 Functions of Several Variables

Now that we have defined vectors, we can move on to functions that take a vector as an input. Let $f : D \subset \mathbb{R}^n \to \mathbb{R}$ be a function. Then $x^* \in D$ is a
a) global minimizer of $f$ on $D$ if $f(x^*) \le f(x)$ for all $x \in D$
b) strict global minimizer of $f$ on $D$ if $f(x^*) < f(x)$ for all $x \in D$ with $x \ne x^*$
c) local minimizer of $f$ on $D$ if $f(x^*) \le f(x)$ for $x \in B(x^*, r)$ for some $r > 0$
d) strict local minimizer of $f$ on $D$ if $f(x^*) < f(x)$ for $x \in B(x^*, r)$ for some $r > 0$, and $x \ne x^*$
e) critical point if $\frac{\partial f}{\partial x_i}(x^*)$ exists and is equal to 0 for $i = 1, 2, \ldots, n$

(Note that the first four conditions are essentially identical to the definitions for a single variable function. Condition e) is comparable, but now we require that all the partial derivatives be zero)

Theorem (Fermat's Theorem, one of the smaller ones) If $f$ is differentiable on $D \subset \mathbb{R}^n$, and $x^* \in D^0$, and $x^*$ is a local minimizer or maximizer of $f$, then $x^*$ also has to be a critical point of $f$.

(This is important because it tells us that every minimum or maximum in the interior of $D$ is a critical point. The converse, that every critical point is a minimum or maximum, does not hold. So if we find all the critical points, we are guaranteed to have all the interior minimizers and maximizers, and possibly a few extra points)

Note that

$$\nabla f(x^*) = \left( \frac{\partial f}{\partial x_1}(x^*), \frac{\partial f}{\partial x_2}(x^*), \ldots, \frac{\partial f}{\partial x_n}(x^*) \right)$$

and $x^*$ is a critical point if and only if $\nabla f(x^*) = 0$. ($\nabla f(x)$ is the gradient of $f$ at $x$)

2.1 Returning to the Taylor Expansion

Consider the Taylor Theorem for 1 or more dimensions:

$n = 1$:

$$f(x) = f(x^*) + f'(x^*)(x - x^*) + \frac{1}{2} f''(z)(x - x^*)^2$$

for some $z$ between $x$ and $x^*$.

$n > 1$:

$$f(x) = f(x^*) + \nabla f(x^*) \cdot (x - x^*) + \frac{1}{2} (x - x^*) \cdot H_f(z)(x - x^*)$$

where $H_f(z)$ is the Hessian $n \times n$ matrix defined so that

$$H_f(z)_{i,j} = \frac{\partial^2 f}{\partial x_i \partial x_j}(z)$$
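More specifically, the Hessian looks like

$$H_f(x) = \begin{pmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\
\frac{\partial^2 f}{\partial x_2 \partial x_1} & \frac{\partial^2 f}{\partial x_2^2} & \cdots & \frac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2}
\end{pmatrix}$$

Note that $H_f(z)$ is a symmetric matrix when the second partial derivatives are continuous, since then

$$\frac{\partial^2 f}{\partial x_j \partial x_i} = \frac{\partial^2 f}{\partial x_i \partial x_j}$$

2.2 Notes on Linear Algebra

Let $A$ be an $n \times n$ symmetric matrix. A quadratic form $Q_A(y) : \mathbb{R}^n \to \mathbb{R}$ is defined by

$$Q_A(y) = y \cdot (Ay) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} y_i y_j$$

Ex Consider the function $f(x, y, z) = x^2 - y^2 + 4z^2 - 2xy + 4yz$. We calculate the gradient as

$$\nabla f(x, y, z) = \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \right) = (2x - 2y, \; -2y - 2x + 4z, \; 8z + 4y)$$

and the Hessian as

$$H_f(x, y, z) = \begin{pmatrix} 2 & -2 & 0 \\ -2 & -2 & 4 \\ 0 & 4 & 8 \end{pmatrix}$$

The quadratic form of the Hessian is then

$$Q_{H_f}(x, y, z) = (x, y, z) \cdot \begin{pmatrix} 2 & -2 & 0 \\ -2 & -2 & 4 \\ 0 & 4 & 8 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = (x, y, z) \cdot \begin{pmatrix} 2x - 2y \\ -2y - 2x + 4z \\ 4y + 8z \end{pmatrix}$$

$$= x(2x - 2y) + y(-2y - 2x + 4z) + z(4y + 8z) = 2x^2 - 2y^2 + 8z^2 - 4xy + 8yz = 2 f(x, y, z)$$

Defn (Positive definite matrix) An $n \times n$ symmetric matrix $A$ and its quadratic form $Q_A(y) = y \cdot (Ay)$ is

  positive definite if $Q_A(y) > 0$ for all nonzero vectors $y \ne 0$
  positive semi-definite if $Q_A(y) \ge 0$ for all vectors $y$
  negative definite if $Q_A(y) < 0$ for all nonzero vectors $y \ne 0$
  negative semi-definite if $Q_A(y) \le 0$ for all vectors $y$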
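(The identity $Q_{H_f} = 2f$ from the example can be spot-checked numerically. The sketch below is an illustration only: the helper name `quadratic_form` and the test points are assumptions made here, and agreement at a handful of integer points is a sanity check, not a proof.)

```python
def f(x, y, z):
    # The example function f(x, y, z) = x^2 - y^2 + 4z^2 - 2xy + 4yz
    return x**2 - y**2 + 4 * z**2 - 2 * x * y + 4 * y * z

# Hessian of f (constant, because f is a quadratic polynomial)
H = [[2, -2, 0],
     [-2, -2, 4],
     [0, 4, 8]]

def quadratic_form(A, v):
    # Q_A(v) = sum over i, j of a_ij * v_i * v_j
    n = len(v)
    return sum(A[i][j] * v[i] * v[j] for i in range(n) for j in range(n))

# A few arbitrary integer test points (exact arithmetic, no rounding)
points = [(1, 0, 0), (0, 1, 0), (1, 2, 3), (-2, 1, 5)]

# Check Q_H(p) == 2 f(p) at every test point
identity_holds = all(quadratic_form(H, p) == 2 * f(*p) for p in points)

for p in points:
    print(p, quadratic_form(H, p), 2 * f(*p))
print(identity_holds)
```

(Note that the first two test points already give $Q_{H_f}(1, 0, 0) = 2$ and $Q_{H_f}(0, 1, 0) = -2$, so this quadratic form takes both positive and negative values.)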
Now consider the Taylor's theorem for multiple dimensions:

$$f(x) = f(x^*) + \nabla f(x^*) \cdot (x - x^*) + \frac{1}{2} (x - x^*) \cdot H_f(z)(x - x^*)$$

for some $z$ between $x$ and $x^*$. If $f$ has a critical point at $x^*$, so that $\nabla f(x^*) = 0$, the middle term vanishes and the sign of $f(x) - f(x^*)$ is the sign of the quadratic form of $H_f(z)$ evaluated at $x - x^*$. Then, if $f$ also has continuous first and second partial derivatives on $\mathbb{R}^n$, $x^*$ is a

a) global minimizer if $H_f(x)$ is positive semi-definite for every $x \in \mathbb{R}^n$
b) strict global minimizer if $H_f(x)$ is positive definite for every $x \in \mathbb{R}^n$
c) global maximizer if $H_f(x)$ is negative semi-definite for every $x \in \mathbb{R}^n$
d) strict global maximizer if $H_f(x)$ is negative definite for every $x \in \mathbb{R}^n$

and so the challenge lies in determining the form of the Hessian (as far as positive/negative (semi) definiteness is concerned).

(It's easier to rule out that a matrix satisfies the conditions above than to prove that it does. For instance, a positive definite matrix has only positive entries on the diagonal, and a negative definite matrix has only negative entries on the diagonal. The Hessian in the example has diagonal entries 2, -2 and 8, so its quadratic form is neither positive nor negative definite. The same diagonal test also rules out semi-definiteness here, because a positive semi-definite matrix has no negative diagonal entries and a negative semi-definite matrix has no positive ones)
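(The diagonal test can be written down in a few lines. It rests on the fact that the diagonal entry $a_{ii}$ equals $Q_A(e_i)$, where $e_i$ is the $i$-th unit vector, so each diagonal entry is one value the quadratic form actually takes. The sketch below is an illustration under that observation; the function name `diagonal_rules_out` is made up here. The test is one-directional: it can rule a class out, but a matrix that passes it may still fail to be definite.)

```python
def diagonal_rules_out(A):
    # a_ii = Q_A(e_i), so the sign of each diagonal entry is a necessary
    # condition for each definiteness class. True means "ruled out".
    diag = [A[i][i] for i in range(len(A))]
    return {
        "positive definite": any(d <= 0 for d in diag),
        "positive semi-definite": any(d < 0 for d in diag),
        "negative definite": any(d >= 0 for d in diag),
        "negative semi-definite": any(d > 0 for d in diag),
    }

# Hessian of f(x, y, z) = x^2 - y^2 + 4z^2 - 2xy + 4yz from the example
H = [[2, -2, 0],
     [-2, -2, 4],
     [0, 4, 8]]

ruled_out = diagonal_rules_out(H)
for name, excluded in ruled_out.items():
    print(name, "ruled out" if excluded else "not ruled out by this test")
```

(For the example Hessian all four classes are ruled out, since $Q(e_1) = 2 > 0$ and $Q(e_2) = -2 < 0$. Evaluating the quadratic form at these two unit vectors therefore settles the semi-definite question as well. For a matrix that passes the test, a further check such as computing eigenvalues would still be needed.)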