
MATH CHAPTER 3: UNCONSTRAINED OPTIMIZATION

W. Erwin Diewert

1. First and Second Order Conditions for a Local Min or Max.

Consider the problem of maximizing or minimizing a function of N variables, f(x_1, ..., x_N) = f(x). From Chapter 1, we found that the first order necessary conditions for x^0 to be a local minimizer or maximizer for f were:

(1) D_v f(x^0) = 0 for all directions v ≠ 0_N

where the first order directional derivative of f in the direction v evaluated at the point x is defined as

(2) D_v f(x) ≡ lim_{t→0} [f(x + tv) − f(x)]/t.

In the case where the first order partial derivatives of f exist and are continuous around x^0, we found that conditions (1) were equivalent to the following N first order necessary conditions for x^0 to be a local minimizer or maximizer for f:

(3) f_1(x^0) = 0; f_2(x^0) = 0; ...; f_N(x^0) = 0

where the ith first order partial derivative of f is defined as

(4) f_i(x) ≡ lim_{t→0} [f(x + te^i) − f(x)]/t; i = 1, ..., N

where e^i is the ith unit vector. It is convenient to introduce a symbol to denote the vector of first order partial derivatives of f evaluated at the point x:

(5) ∇f(x) ≡ [f_1(x), f_2(x), ..., f_N(x)]^T.

Note that we have defined ∇f(x) (called the gradient vector of f evaluated at x) to be a column vector. Using the notation (5), the system of first order conditions (3) can be written more efficiently as:

(6) ∇f(x^0) = 0_N.

Also using (5), it can be seen that our old First Order Directional Derivative Theorem can be written as:

(7) D_v f(x^0) = Σ_{i=1}^N v_i f_i(x^0) = v^T ∇f(x^0).

Recall that we required the first order partial derivative functions f_i(x) to exist and be continuous around x^0 in order to derive the formula (7). It is also convenient to introduce a notation for the N by N matrix of second order partial derivatives of f evaluated at the point x:

(8) ∇²f(x) ≡ [ f_11(x), ..., f_1N(x)
               :               :
               f_N1(x), ..., f_NN(x) ]

where the ijth element in ∇²f(x) is defined as

(9) f_ij(x) ≡ lim_{t→0} [f_i(x + te^j) − f_i(x)]/t,

where f_i(x) is the ith first order partial derivative of f evaluated at the point x. The N by N matrix ∇²f(x) is called the Hessian matrix of f evaluated at x. Recall that the directional derivative of the function D_v f(x) evaluated at x in the direction u ≠ 0_N is defined as:

(10) D_vu f(x) ≡ lim_{t→0} [D_v f(x + tu) − D_v f(x)]/t.

Using (8), it can be seen that our old Second Order Directional Derivative Theorem can be written as follows:

(11) D_vu f(x) = v^T ∇²f(x) u.

Recall that in order to prove (11), we required the existence and continuity of the second order partial derivative functions f_ij(x) around the point x. Armed with formula (11), we can state second order sufficient conditions for x^0 to be a strict local minimizer of f: in addition to the first order conditions (6), we require the following second order conditions:

(12) D_vv f(x^0) = v^T ∇²f(x^0) v > 0 for all v ≠ 0_N.

If it is convenient, we can replace v ≠ 0_N in (12) by v^T v = 1 in order to obtain an equivalent set of conditions. Since conditions (6) are equivalent to conditions (1), it can be seen that conditions (6) and (12) are analogues of our single variable calculus conditions for a strict local minimum, except that these univariate conditions now have to hold for all possible directions v.
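To make conditions (6) and (12) concrete, here is a small numerical sketch in Python; the function f(x_1, x_2) = x_1² + x_1 x_2 + x_2² and the use of NumPy are illustrative assumptions, not part of the text. The gradient vanishes at x^0 = 0_2 and v^T ∇²f(x^0) v is positive for a large sample of random directions v:

```python
import numpy as np

# Numerical sketch of conditions (6) and (12) for the illustrative function
# f(x1, x2) = x1**2 + x1*x2 + x2**2, which has x0 = (0, 0) as a critical point.
def gradient(x):
    # components f_1 and f_2 of the gradient vector (5)
    return np.array([2.0 * x[0] + x[1], x[0] + 2.0 * x[1]])

hessian = np.array([[2.0, 1.0],
                    [1.0, 2.0]])      # the Hessian (8) is constant for this f

x0 = np.zeros(2)
print(gradient(x0))                   # condition (6): the gradient vanishes at x0

rng = np.random.default_rng(0)
v = rng.standard_normal((1000, 2))    # a large sample of directions v != 0_N
quad = np.einsum('ki,ij,kj->k', v, hessian, v)
print(quad.min() > 0)                 # condition (12): v'Hv > 0 for every sampled v
```

Of course, sampling directions only suggests that (12) holds; sections 5 and 6 below give exact finite procedures for verifying it.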

The counterpart to the univariate second order necessary conditions for x^0 to be a local minimizer for f is conditions (6) plus the following second order conditions:

(13) D_vv f(x^0) = v^T ∇²f(x^0) v ≥ 0 for all v ≠ 0_N.

Obviously, there are analogous sufficient conditions for x^0 to be a strict local maximizer for f: in addition to (6), we require

(14) D_vv f(x^0) = v^T ∇²f(x^0) v < 0 for all v ≠ 0_N.

Finally, the analogous necessary conditions for x^0 to be a local maximizer for f are (6) and the following second order conditions:

(15) D_vv f(x^0) = v^T ∇²f(x^0) v ≤ 0 for all v ≠ 0_N.

Notice that if N ≥ 2, then the second order conditions (12)-(15) involve checking an infinite number of inequalities. In Chapter 1, we have shown how this task can be accomplished in the case where N = 2. In section 4 below, we will show how to do this checking of inequalities for a general N. However, before we do this, it is useful to develop a few properties of quadratic functions.

2. Taylor's Theorem and Quadratic Approximations

Taylor's Theorem: Let f(x) be a function of one variable defined over the interval x^0 ≤ x ≤ x^1 where x^0 < x^1. Suppose the (n−1)st derivative of f, f^(n−1)(x), exists and is continuous over this interval and suppose that the nth derivative of f, f^(n)(x), exists for x such that x^0 < x < x^1. Define the "remainder" R by the following equation:

(16) f(x^1) = f(x^0) + Σ_{k=1}^{n−1} [(x^1 − x^0)^k / k!] f^(k)(x^0) + R.

Then there exists a point x* such that x^0 < x* < x^1 and

(17) R = (x^1 − x^0)^n f^(n)(x*) / n!.

Proof: Define the number M by the following equation:

(18) f(x^1) = f(x^0) + Σ_{k=1}^{n−1} [(x^1 − x^0)^k / k!] f^(k)(x^0) + M (x^1 − x^0)^n / n!.

Define the function F(x) by:

(19) F(x) ≡ −f(x^1) + f(x) + Σ_{k=1}^{n−1} [(x^1 − x)^k / k!] f^(k)(x) + M (x^1 − x)^n / n!.

It can be seen that F(x^1) = 0 and, by using (18), it can be seen that F(x^0) = 0 as well. Thus the function F(x) is continuous for x such that x^0 ≤ x ≤ x^1 and F is such that F(x^0) = F(x^1). Thus the function F must attain a local min or a local max for at least one x* such that x^0 < x* < x^1. The first order necessary conditions for a min or a max of F(x) must hold at x = x*, so we have, differentiating the F defined by (19):

0 = F′(x*)
  = f′(x*) + Σ_{k=1}^{n−1} { [(x^1 − x*)^k / k!] f^(k+1)(x*) − [(x^1 − x*)^{k−1} / (k−1)!] f^(k)(x*) } − [(x^1 − x*)^{n−1} / (n−1)!] M
  = f′(x*) + { (x^1 − x*) f^(2)(x*) + [(x^1 − x*)² / 2!] f^(3)(x*) + ... + [(x^1 − x*)^{n−1} / (n−1)!] f^(n)(x*) }
    − { f′(x*) + (x^1 − x*) f^(2)(x*) + [(x^1 − x*)² / 2!] f^(3)(x*) + ... + [(x^1 − x*)^{n−2} / (n−2)!] f^(n−1)(x*) }
    − [(x^1 − x*)^{n−1} / (n−1)!] M
(20) = (x^1 − x*)^{n−1} [f^(n)(x*) − M] / (n−1)!   cancelling terms.

Since x^0 < x* < x^1 and hence x^1 − x* > 0, we see that (20) implies

(21) f^(n)(x*) = M.

Now substitute (21) into (18) and we obtain (16) where R is defined by (17). Q.E.D.

Note that Taylor's Theorem reduces to the Mean Value Theorem if we set n = 1. When n = 2, Taylor's Theorem becomes, letting x^1 be replaced by x:

(22) f(x) = f(x^0) + (x − x^0) f′(x^0) + R.

If we drop the remainder term R on the right hand side of (22), what is left is called the linear approximation to f around the point x^0; i.e., define

(23) l(x) ≡ f(x^0) + (x − x^0) f′(x^0).

Then for x "reasonably" close to x^0, l(x) will approximate f(x) "reasonably" well:

(24) f(x) ≅ f(x^0) + (x − x^0) f′(x^0).

Note that l(x^0) = f(x^0) and l′(x^0) = f′(x^0); i.e., the linear approximation to f around the point x^0 has the same level and first derivative as f when evaluated at x = x^0.
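A quick numerical sketch of the linear approximation (23), taking the hypothetical example f(x) = e^x with x^0 = 0 (this choice of f is an assumption made only for illustration): the approximation error |f(x^0 + h) − l(x^0 + h)| shrinks roughly like h², consistent with the remainder term in (22).

```python
import numpy as np

# Sketch of the linear approximation (23) for the illustrative choice
# f(x) = exp(x) with x0 = 0, so that l(x) = 1 + x.
f = np.exp
x0 = 0.0

def l(x):
    return f(x0) + (x - x0) * np.exp(x0)   # f(x0) + (x - x0) f'(x0)

for h in [0.1, 0.01, 0.001]:
    print(h, abs(f(x0 + h) - l(x0 + h)))   # the error shrinks roughly like h**2
```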

Figure 1: The Linear Approximation to f at x^0. [Graph of f(x) together with the tangent line l(x) through the point (x^0, f(x^0)).]

When n = 3, Taylor's Theorem becomes, letting x^1 be replaced by x:

(25) f(x) = f(x^0) + (x − x^0) f′(x^0) + (1/2)(x − x^0)² f″(x^0) + R.

If we drop the remainder term R on the right hand side of (25), what is left is called the quadratic approximation to f around the point x^0:

(26) q(x) ≡ f(x^0) + (x − x^0) f′(x^0) + (1/2)(x − x^0)² f″(x^0).

Note that the quadratic approximation to f(x) around the point x^0 will have the same level and first and second derivatives as f evaluated at x = x^0; i.e., we have

(27) q(x^0) = f(x^0); q′(x^0) = f′(x^0); q″(x^0) = f″(x^0).

The quadratic approximation to f around the point x^0 will generally approximate f around x^0 more closely than the corresponding linear approximation.

Figure 2: The Quadratic Approximation to f at x^0. [Graph of f(x) together with the approximating parabola q(x) through the point (x^0, f(x^0)).]
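Extending the sketch above to the quadratic approximation (26), again with the hypothetical choice f(x) = e^x and x^0 = 0: q(x) tracks f(x) one order better than l(x) as x approaches x^0.

```python
import numpy as np

# Comparing l(x) from (23) and q(x) from (26), again for the illustrative
# choice f(x) = exp(x) with x0 = 0.
f = np.exp

def l(x):
    return 1.0 + x                     # linear approximation around x0 = 0

def q(x):
    return 1.0 + x + 0.5 * x**2        # adds the (1/2)(x - x0)**2 f''(x0) term

for h in [0.5, 0.1, 0.01]:
    print(h, abs(f(h) - l(h)), abs(f(h) - q(h)))   # q's error is an order smaller
```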

The concepts of linear and quadratic approximations to general nonlinear functions can be extended to functions of N variables using the univariate analysis developed above. Let f(x) = f(x_1, x_2, ..., x_N) be a function of N variables with continuous first and second order partial derivatives. Now use f in order to define the following function of a single variable t:

(28) g(t) ≡ f(x^0 + t(x − x^0)); 0 ≤ t ≤ 1.

Thus we have:

(29) g(0) = f(x^0) and g(1) = f(x).

Now apply the linear approximation idea to g around the point t = 0. Thus we have:

g(0) + g′(0)(t − 0)
   = f(x^0) + Σ_{i=1}^N f_i(x^0)(x_i − x_i^0)(t − 0)   differentiating (28) with respect to t and evaluating the derivatives at t = 0
(30) = f(x^0) + t ∇f(x^0)^T (x − x^0)   rearranging terms.

Letting t = 1 and using (29), (30) becomes:

(31) f(x) ≅ f(x^0) + ∇f(x^0)^T (x − x^0)

and the right hand side of (31) can be regarded as the linear approximation to f(x) around x = x^0. In order to calculate the quadratic approximation to g(t) around t = 0, we need to calculate the first and second derivatives of g(t). Differentiating (28) with respect to t, we obtain:

(32) g′(t) = Σ_{i=1}^N f_i(x^0 + t(x − x^0))(x_i − x_i^0) = ∇f(x^0 + t(x − x^0))^T (x − x^0);

(33) g″(t) = Σ_{i=1}^N Σ_{j=1}^N f_ij(x^0 + t(x − x^0))(x_i − x_i^0)(x_j − x_j^0) = (x − x^0)^T ∇²f(x^0 + t(x − x^0)) (x − x^0).

Thus the quadratic approximation to g around the point t = 0 is:

(34) g(0) + g′(0)(t − 0) + (1/2) g″(0)(t − 0)²

(35) = f(x^0) + t ∇f(x^0)^T (x − x^0) + (1/2) t² (x − x^0)^T ∇²f(x^0)(x − x^0)

where (35) follows from (34) using (29), (32) and (33). Now evaluate (35) at t = 1 and, using (29), we have:

(36) f(x) ≅ f(x^0) + ∇f(x^0)^T (x − x^0) + (1/2)(x − x^0)^T ∇²f(x^0)(x − x^0).

Note that the right hand side of (36) is a quadratic function of x; it is called the quadratic approximation to f(x) around the point x = x^0. Note that Young's Theorem (f_ij(x^0) = f_ji(x^0) for all i ≠ j) implies that the N by N matrix ∇²f(x^0) in (36) will be symmetric. Linear and quadratic approximations to general nonlinear functions of N variables are used widely in economics, science, engineering, statistics and business.

3. Rules for Differentiating Linear and Quadratic Functions.

Suppose f(x) is a linear function of N variables; i.e.,

(37) f(x) ≡ a + Σ_{i=1}^N b_i x_i = a + b^T x

where b^T ≡ [b_1, b_2, ..., b_N]. Partially differentiating the f(x) defined by (37) with respect to the components of x yields:

(38) f_i(x) = ∂f(x)/∂x_i = b_i; i = 1, 2, ..., N.

Obviously, equations (38) can be rewritten as:

(39) ∇f(x) = b if f(x) ≡ a + b^T x; Rule 1.

If we further differentiate the f_i(x) defined by (38) with respect to the components of x, we obtain:

(40) f_ij(x) ≡ ∂²f(x)/∂x_i∂x_j = 0; 1 ≤ i, j ≤ N.

The equations (40) can be written more compactly as:

(41) ∇²f(x) = 0_{N×N} if f(x) ≡ a + b^T x; Rule 2.

Now suppose f(x) is the following (homogeneous) quadratic function of N variables; i.e.,

(42) f(x) ≡ Σ_{i=1}^N Σ_{j=1}^N a_ij x_i x_j = x^T Ax where A = A^T.

Note that we are assuming that the matrix of coefficients A ≡ [a_ij] in (42) is symmetric; i.e., we have a_ij = a_ji for all i ≠ j. We want to calculate the first and second order partial derivatives of the f defined by (42). Let us first consider the case N = 2. In this case, taking into account the fact that a_12 = a_21, we have:

(43) f(x_1, x_2) = a_11 x_1² + 2a_12 x_1 x_2 + a_22 x_2².

The first order partial derivatives of (43) are:

(44) f_1(x_1, x_2) = 2a_11 x_1 + 2a_12 x_2; f_2(x_1, x_2) = 2a_12 x_1 + 2a_22 x_2.

Equations (44) can be written as:

(45) ∇f(x) = 2Ax if f(x) ≡ x^T Ax, A = A^T; Rule 3.

If we partially differentiate the f_i(x_1, x_2) in (44) with respect to x_1 and x_2, we obtain the following second order derivatives:

(46) f_11(x_1, x_2) = 2a_11; f_12(x_1, x_2) = 2a_12; f_21(x_1, x_2) = 2a_12; f_22(x_1, x_2) = 2a_22.

Using matrices, equations (46) can be rewritten as:

(47) ∇²f(x) = 2A if f(x) ≡ x^T Ax, A = A^T; Rule 4.

It can be verified that Rules 3 and 4 hold for a general N and not only for the cases N = 1 and N = 2. Rules 1 to 4 are extremely useful and should be memorized.

Problems:

1. Verify Rules 3 and 4 for the case N = 3.

2. Consider the following system of equations:

(i) y = Xx + e

where y and e are M dimensional vectors, X is an M by N matrix and x is an N dimensional vector. Define the function f(x) as

f(x) ≡ e^T e = Σ_{i=1}^M e_i²
     = (y − Xx)^T (y − Xx)   using (i) to solve for e
     = (y^T − x^T X^T)(y − Xx)
     = y^T y − y^T Xx − x^T X^T y + x^T X^T Xx
(ii) = y^T y − 2y^T Xx + x^T X^T Xx   using y^T Xx = [x^T X^T y]^T.

Assume that (X^T X)^{−1} exists.

(a) Show that x̂ ≡ (X^T X)^{−1} X^T y satisfies the system of first order conditions for minimizing f(x):

(iii) ∇f(x̂) = 0_N.

[In statistics, x̂ is known as the least squares estimator for the vector of parameters x.]

(b) Show that ∇²f(x) does not depend on x.

(c) Show that

(iv) v^T ∇²f(x̂) v > 0 for every v ≠ 0_N

and so x̂ is in fact a local minimizer for f(x). [This part of the problem is difficult.] This problem shows that the least squares estimator x̂ actually does minimize the sum of squared errors e^T e with respect to the vector of coefficients x.

4. Quadratic Forms and Definite Matrices

Let A be an N by N symmetric matrix and consider the following definitions:

(48) A is positive definite iff x^T Ax > 0 for all x ≠ 0_N;
(49) A is negative definite iff x^T Ax < 0 for all x ≠ 0_N;
(50) A is positive semidefinite iff x^T Ax ≥ 0 for all x ≠ 0_N;
(51) A is negative semidefinite iff x^T Ax ≤ 0 for all x ≠ 0_N;
(52) A is indefinite iff it is none of (48)-(51).

Recall the second order conditions (12)-(15) that were discussed in section 1 above. If we let the A matrix in this section equal ∇²f(x^0) in section 1, it can be

seen that (48) corresponds to conditions (12) for a strict local minimum, (49) corresponds to conditions (14) for a strict local maximum, (50) corresponds to the second order necessary conditions (13) for a local minimum and (51) corresponds to the second order necessary conditions (15) for a local maximum.

In the following section, we show how the Gaussian triangularization procedure can be adapted to determine whether a symmetric matrix A has any of the definiteness properties (48)-(52).

Problem:

3. Let D be an N by N diagonal matrix with main diagonal elements d_ii for i = 1, 2, ..., N. Determine what restrictions the d_ii must satisfy in order for D to be: (i) positive definite; (ii) negative definite; (iii) positive semidefinite; (iv) negative semidefinite; and (v) indefinite (assume N ≥ 2 for this case).

5. The Method of Lagrange and Gauss for Diagonalizing a Symmetric Matrix

Recall the Gaussian triangularization procedure that was discussed in section 3 of Chapter 2 on Elementary Matrix Algebra. If A is a symmetric N by N matrix, then this algorithm can readily be modified to transform A into a diagonal matrix. Consider Stage 1 of our old algorithm, where we added multiples of one row of A to other rows of A to create zeros below the first component of the first column of A. We again apply Stage 1 of our old algorithm, but before we proceed to Stage 2, we now add multiples of the final Stage 1 first column to the remaining columns of the transformed A matrix to create zeros in the remainder of row 1. In other words, we repeat the sequence of elementary row operations that we used to accomplish Stage 1 of the algorithm, but now we apply the same sequence to the columns as well.

More explicitly, consider the 3 cases for Stage 1 of our old algorithm. In case (i), we had a_11 ≠ 0, and at the end of Stage 1, the transformed A matrix had the following form (the E_n represent elementary row operation matrices that add multiples of the first row of A to the remaining rows of A):

(53) [ a_11,    a_12, ..., a_1N
       0_{N−1},     A^(2)       ] = E_N E_{N−1} ... E_2 A.

Now add −a_12/a_11 times the first column of (53) to the second column of (53); add −a_13/a_11 times the first column of (53) to the 3rd column of (53); ...; add −a_1N/a_11 times the first column of (53) to the Nth column of (53). It can be verified that these elementary column operations can be performed by multiplying (53) on the right by E_2^T E_3^T ... E_N^T; i.e., the transposes of the sequence of row operation

matrices E_2, E_3, ..., E_N sweep out a_12 = a_21, a_13 = a_31, ..., a_1N = a_N1. Thus we have

(54) E_N E_{N−1} ... E_2 A E_2^T E_3^T ... E_N^T = [ a_11,    0_{N−1}^T
                                                     0_{N−1},  A*       ]

at the end of our new Stage 1 algorithm for case (i) where a_11 ≠ 0. If we take transposes of both sides of (54), we deduce that the matrix on the left hand side of (54) is symmetric. Hence A* on the right hand side of (54) must also be symmetric. Hence, we can now apply the next stage of our modified algorithm to the N−1 by N−1 symmetric matrix A*.

Now suppose that at Stage 1 of our old algorithm, case (iii) occurred; i.e., a_i1 = 0 for i = 1, 2, ..., N. But since A is now assumed to be symmetric, we have a_1j = 0 as well for j = 1, 2, ..., N. Thus in case (iii), A has the following form:

(55) A = [ 0,        0_{N−1}^T
           0_{N−1},  A*        ]

which is the required form for the next stage of our modified algorithm.

Finally, suppose that at Stage 1 of our old algorithm, case (ii) occurred; i.e., a_11 = 0 but a_i1 ≠ 0 for some i > 1. Recall that in our old algorithm, we added row i of A to the first row of A and then applied the case (i) operations to the transformed matrix. In the present algorithm, we not only add row i of A to row 1, we then immediately add column i of the transformed matrix to column 1. The resulting matrix will be symmetric with the element 2a_i1 + a_ii in the northwest corner of the transformed matrix. We now need to consider 2 cases:

Case (a): 2a_i1 + a_ii ≠ 0.

In this case, we can now apply our new case (i) algorithm on the previous page to this transformed matrix. If we denote E_1 as the elementary row matrix that adds row i of A to row 1, then we have the following decomposition at the end of Stage 1 of our new algorithm:

(56) E_N E_{N−1} ... E_2 E_1 A E_1^T E_2^T ... E_N^T = [ 2a_i1 + a_ii,  0_{N−1}^T
                                                         0_{N−1},       A*       ];

i.e., we have again reduced A into block diagonal form where A* is a symmetric N−1 by N−1 matrix.

Case (b): 2a_i1 + a_ii = 0.

In this case, if we look at row 1 and row i and column 1 and column i of the original A matrix, this 2 by 2 submatrix of A has the following form (using a_11 = 0 and a_ii = −2a_i1):

[ 0,     a_i1
  a_i1, −2a_i1 ].

After adding row i of the original matrix to row 1 and then adding column i of the original matrix to column 1, the above 2 by 2 submatrix is transformed into:

[ 0,     −a_i1
  −a_i1, −2a_i1 ]

so we have not succeeded in getting a nonzero element in the northwest corner of the transformed matrix. However, to solve this problem, all we have to do is add row i of the transformed matrix to row 1 and then add column i of the transformed matrix to column 1. Then the new transformed A matrix will be symmetric and will have −4a_i1 ≠ 0 in the northwest corner. Hence in this case (b), we can again obtain a counterpart to (56) where −4a_i1 will replace 2a_i1 + a_ii in the northwest corner of the matrix on the right hand side of (56). Hence in both cases (a) and (b), we have again reduced A into block diagonal form where A* is an N−1 by N−1 symmetric matrix.

Hence, for all cases, at the end of Stage 1 of our new algorithm, we have reduced A into the following block diagonal form:

(57) [ d_11,     0_{N−1}^T
       0_{N−1},  A*        ].

At Stage 2 of the algorithm, we apply the same type of elementary row and column operations to the symmetric matrix A*, and at the end of Stage 2, we have reduced A* into the following form:

(58) A* = [ d_22,     0_{N−2}^T
            0_{N−2},  A**       ]

where A** is an N−2 by N−2 symmetric matrix. Now further reduce A** into block diagonal form; etc. Finally, at the end of Stage N, we have transformed A into diagonal form by means of a sequence of elementary row and column operations where we add multiples of one row to another row and then repeat the same operation on the corresponding columns. If we let the N by N matrix E denote the product of all of the elementary row matrices, then we have

(59) EAE^T = D where D = [d_ij] and d_ij = 0 if i ≠ j.

Example: Let

A ≡ [ 1, 2, 3
      2, 1, 0
      3, 0, 1 ].

Stage 1: We are in case (i): a_11 = 1 ≠ 0. Hence take −2 times row 1 and add it to row 2; take −3 times row 1 and add it to row 3. We obtain the following matrix:

[ 1,  2,  3
  0, −3, −6
  0, −6, −8 ].

Now take −2 times column 1 and add it to column 2; take −3 times column 1 and add it to column 3; we get:

[ 1,  0,  0
  0, −3, −6
  0, −6, −8 ].

Stage 2: Now take −2 times row 2 and add it to row 3; we get:

[ 1,  0,  0
  0, −3, −6
  0,  0,  4 ].

Finally, take −2 times column 2 and add it to column 3; we get:

(60) D = [ 1,  0, 0
           0, −3, 0
           0,  0, 4 ],  a diagonal matrix.

The two elementary row matrices that we used at Stage 1 of the algorithm were:

(61) E_1 ≡ [ 1, 0, 0          E_2 ≡ [ 1, 0, 0
            −2, 1, 0                  0, 1, 0
             0, 0, 1 ];              −3, 0, 1 ].

The final elementary row matrix that we used at Stage 2 of the algorithm was:

(62) E_3 ≡ [ 1,  0, 0
             0,  1, 0
             0, −2, 1 ].
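The example above can also be reproduced mechanically. The following Python sketch implements the Lagrange-Gauss procedure for the simple situation where every pivot is nonzero (case (i) at each stage); the handling of cases (ii) and (iii) is omitted for brevity, and NumPy is an assumed dependency.

```python
import numpy as np

# A minimal sketch of the Lagrange-Gauss diagonalization, covering only the
# case where every pivot is nonzero (case (i) at each stage); cases (ii) and
# (iii) of the text are not handled here.
def lagrange_gauss(A):
    A = np.array(A, dtype=float)
    n = A.shape[0]
    E = np.eye(n)
    for k in range(n - 1):
        assert A[k, k] != 0.0, "zero pivot: cases (ii)/(iii) not implemented"
        for i in range(k + 1, n):
            m = -A[i, k] / A[k, k]
            A[i, :] += m * A[k, :]     # elementary row operation
            A[:, i] += m * A[:, k]     # the same operation applied to the columns
            E[i, :] += m * E[k, :]     # accumulate the product of the E matrices
    return E, A                        # A has been reduced to the diagonal D

A = [[1.0, 2.0, 3.0],
     [2.0, 1.0, 0.0],
     [3.0, 0.0, 1.0]]
E, D = lagrange_gauss(A)
print(np.round(D, 10))                        # diag(1, -3, 4), matching (60)
print(np.allclose(E @ np.array(A) @ E.T, D))  # verifies E A E^T = D, as in (59)
```

Running this on the example matrix reproduces D = diag(1, −3, 4) and an E satisfying EAE^T = D, as required by (59).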

Problem:

4. Define E ≡ E_3 E_2 E_1 where E_3 is defined by (62) and E_1 and E_2 are defined in (61). Show that EAE^T = D where D is defined by (60).

The matrix E and the diagonal matrix D which occur in the Lagrange-Gauss diagonalization procedure (see (59) above) can be used to determine whether the symmetric A satisfies any of the definiteness properties (48)-(52). Consider the E matrix which occurs in (59). Since E is a product of elementary row matrices, each of which has determinant equal to 1, it can be seen that

(63) |E| = |E^T| = 1.

Since |E^T| = 1 ≠ 0, (E^T)^{−1} exists. Now for each x ≠ 0_N, consider the y defined by

(64) y ≡ (E^T)^{−1} x.

Suppose y = 0_N. Then premultiplying both sides of (64) by E^T leads to x = 0_N, which contradicts x ≠ 0_N. Hence if x ≠ 0_N, then the y defined by (64) also satisfies y ≠ 0_N. Let x ≠ 0_N and define y by (64). Premultiplying both sides of (64) by E^T leads to

(65) x = E^T y where y ≠ 0_N.

Hence for x ≠ 0_N, we have

x^T Ax = (E^T y)^T A (E^T y)   using (65)
       = y^T EAE^T y
       = y^T Dy                using (59)
(66)   = Σ_{i=1}^N d_ii y_i².

Thus necessary and sufficient conditions for A to be positive definite are:

(67) d_ii > 0 for i = 1, 2, ..., N.

Using (66) and (49), it can be seen that necessary and sufficient conditions for A to be negative definite are:

(68) d_ii < 0; i = 1, ..., N.

Similarly, necessary and sufficient conditions for A to be positive semidefinite are:

(69) d_ii ≥ 0; i = 1, ..., N.

Finally, necessary and sufficient conditions for A to be negative semidefinite are:

(70) d_ii ≤ 0; i = 1, ..., N.

Problem:

5. Let A ≡ [ … ; … 2 ]. Which of the definiteness properties (48)-(52) does A satisfy?

Historical Note: The above reduction of a quadratic form x^T Ax to a sum of squares y^T Dy was accomplished by J.-L. Lagrange (1759), "Recherches sur la méthode de maximis et minimis", Miscellanea Taurinensia 1, for the cases N = 2 and N = 3. Carl Friedrich Gauss described the general algorithm in 1810; see his Theory of the Combination of Observations Least Subject to Errors, G.W. Stewart, translator, SIAM Classics in Applied Mathematics, 1995. This publication indicates that Gauss arrived at the principle of least squares estimation in 1794 or 1795, but the French mathematician A.M. Legendre independently derived the principle (and named it) in 1805 and actually published the method before Gauss.

6. Checking Second Order Conditions Using Determinants

Let A = [a_ij] be an N by N symmetric matrix and suppose that we want to check whether A is a positive definite matrix. If A is positive definite, then it must be the case that a_11 > 0. Why is this? By the definition of A being positive definite, (48) above, we must have x^T Ax > 0 for all x ≠ 0_N. Let x = e^1, the first unit vector. Then if A is positive definite, we must have

(71) e^1T A e^1 = a_11 > 0.

We can rewrite (71) using determinantal notation. Since the determinant of a one by one matrix is simply equal to the single element, (71) is equivalent to:

(72) |a_11| > 0.

Now if the N by N matrix A is positive definite, it can be seen that we must have

(73) 0 < [x_1, x_2, 0_{N−2}^T] A [x_1, x_2, 0_{N−2}^T]^T = [x_1, x_2] [ a_11, a_12 ; a_12, a_22 ] [x_1, x_2]^T

for all x_1, x_2 such that [x_1, x_2] ≠ [0, 0]. This means that the top left corner 2 by 2 submatrix of A must also be positive definite if A is positive definite. Hence, by

the previous section, there exists a 2 by 2 elementary row matrix E^(2) which, along with E^(2)T, reduces the 2 by 2 submatrix of A into diagonal form. Using (71), it can be seen that the E^(2) which will do the job is

(74) E^(2) ≡ [ 1,           0
              −a_12/a_11,   1 ]

and we have

(75) E^(2) [ a_11, a_12 ; a_12, a_22 ] E^(2)T = [ d_11, 0 ; 0, d_22 ]

where the d_ii turn out to be:

(76) d_11 ≡ a_11;
(77) d_22 ≡ a_22 − a_12²/a_11.

From the previous section, we know that necessary and sufficient conditions for the 2 by 2 submatrix of A to be positive definite are:

(78) d_11 > 0; d_22 > 0.

Since |E^(2)| = 1, taking determinants on both sides of (75) yields:

(79) det[ a_11, a_12 ; a_12, a_22 ] = det[ d_11, 0 ; 0, d_22 ] = d_11 d_22 > 0

where the inequality follows from (78). Using (76) and (79), it can be seen that the determinantal conditions:

(80) a_11 > 0;

(81) det[ a_11, a_12 ; a_12, a_22 ] > 0

are necessary and sufficient for conditions (78), which in turn are necessary and sufficient for the positive definiteness of the top left corner 2 by 2 submatrix of A. If the N by N matrix A is positive definite, it can be seen that we must have

(82) 0 < [x_1, x_2, x_3, 0_{N−3}^T] A [x_1, x_2, x_3, 0_{N−3}^T]^T = [x_1, x_2, x_3] [ a_11, a_12, a_13 ; a_12, a_22, a_23 ; a_13, a_23, a_33 ] [x_1, x_2, x_3]^T

for all [x_1, x_2, x_3] ≠ [0, 0, 0]. This means that the top left corner 3 by 3 submatrix of A must also be positive definite. Hence there exists a 3 by 3 elementary row matrix E^(3) with |E^(3)| = 1 such that

(83) E^(3) [ a_11, a_12, a_13       [ d_11^(3), 0,        0
             a_12, a_22, a_23    =    0,        d_22^(3), 0
             a_13, a_23, a_33 ] E^(3)T = 0,     0,        d_33^(3) ]

where the d_ii^(3) satisfy:

(84) d_11^(3) > 0; d_22^(3) > 0; d_33^(3) > 0.

Since |E^(3)| = 1, taking determinants on both sides of (83) yields

(85) det[ a_11, a_12, a_13 ; a_12, a_22, a_23 ; a_13, a_23, a_33 ] = det[ d_11^(3), 0, 0 ; 0, d_22^(3), 0 ; 0, 0, d_33^(3) ] = d_11^(3) d_22^(3) d_33^(3) > 0

where the inequality in (85) follows from (84).

When A is positive definite, we need to show that the d_11^(3) and d_22^(3) which occur in (83)-(85) are the same as the d_11 and d_22 which occurred in (76)-(79). But this is obviously true using the Gaussian diagonalization algorithm: when we diagonalize the 3 by 3 submatrix of A, we must first diagonalize the 2 by 2 submatrix of A, and hence the d_11^(3) and d_22^(3) in (83) will equal the d_11 and d_22 which occurred in (75). Hence, we can rewrite (85) as follows:

(86) det[ a_11, a_12, a_13 ; a_12, a_22, a_23 ; a_13, a_23, a_33 ] = d_11 d_22 d_33 > 0;

i.e., we have dropped the superscripts on the d_ii. Now it can be seen that the determinantal inequalities (80), (81) and

(87) det[ a_11, a_12, a_13 ; a_12, a_22, a_23 ; a_13, a_23, a_33 ] > 0

along with the equalities in (76), (79) and (86) are necessary and sufficient for the inequalities

(88) d_11 > 0, d_22 > 0, d_33 > 0

which in turn are necessary and sufficient for the top left 3 by 3 submatrix of A to be positive definite. Obviously, the above process can be continued until we obtain the following N determinantal conditions, which are necessary and sufficient for the N by N symmetric matrix A to be positive definite:

(89) a_11 > 0; det[ a_11, a_12 ; a_12, a_22 ] > 0; det[ a_11, a_12, a_13 ; a_12, a_22, a_23 ; a_13, a_23, a_33 ] > 0; ...; |A| > 0.

How can we adapt the above analysis to obtain conditions for A to be negative definite? Obviously, the Gaussian diagonalization procedure can again be used: the only difference in the analysis will be that the diagonal elements d_ii must all be negative in the case where A is negative definite. This means that the determinantal conditions in (89) that involve an odd number of rows and columns of A must have their signs changed, since these determinants will equal the product of an odd number of the d_ii. Hence the following N determinantal conditions are necessary and sufficient for the N by N symmetric matrix A to be negative definite:

(90) a_11 < 0; det[ a_11, a_12 ; a_12, a_22 ] > 0; det[ a_11, a_12, a_13 ; a_12, a_22, a_23 ; a_13, a_23, a_33 ] < 0; ...; (−1)^N |A| > 0.

Turning now to determinantal conditions for positive semidefiniteness or negative semidefiniteness, one might think that the conditions are a straightforward modification of conditions (89) and (90) respectively, where the strict inequalities (>) are replaced by weak inequalities (≥). Unfortunately, this thought is incorrect, as the following example shows.

Example: A ≡ [ 0, 0, 0
               0, 1, 0
               0, 0, −1 ].

In this case, we see that a_11 = 0 ≥ 0;

det[ a_11, a_12 ; a_12, a_22 ] = det[ 0, 0 ; 0, 1 ] = 0, and |A| = 0. Hence the weak inequality forms of conditions (89) and (90) are both satisfied, so we might want to conclude that this A is both positive and negative semidefinite. However, this is not so: A is indefinite, since e^2T A e^2 = a_22 = 1 > 0 and e^3T A e^3 = a_33 = −1 < 0.

The problem with this example is that all of the elements in the first row and column of A are zero and hence d_11 is zero. Now look back at the inequalities (79) and (86): it can be seen that if d_11 = 0, then these inequalities are no longer valid. However, if instead of always picking submatrices of A that included the first row and column of A, we picked submatrices of A that excluded the first row and column, then we would discover that the submatrix of A which consists of rows 2 and 3 and columns 2 and 3 is indefinite; i.e., we have

(91) a_22 = 1 > 0 and det[ a_22, a_23 ; a_23, a_33 ] = −1 < 0.

In order to determine whether A is positive semidefinite, we replace the strict inequalities in (89) by weak inequalities, but the resulting weak inequalities must be checked for all possible choices of the rows (and corresponding columns) of A; i.e., necessary and sufficient conditions for A to be positive semidefinite are:

(92) |a_ii| = a_ii ≥ 0 for i = 1, 2, ..., N;
     det[ a_{i1,i1}, a_{i1,i2} ; a_{i1,i2}, a_{i2,i2} ] ≥ 0 for 1 ≤ i1 < i2 ≤ N;
     det[ a_{i1,i1}, a_{i1,i2}, a_{i1,i3} ; a_{i1,i2}, a_{i2,i2}, a_{i2,i3} ; a_{i1,i3}, a_{i2,i3}, a_{i3,i3} ] ≥ 0 for 1 ≤ i1 < i2 < i3 ≤ N;
     :
     |A| ≥ 0.

In the 2 by 2 case, conditions (92) boil down to the following 3 conditions:

(93) a_11 ≥ 0; a_22 ≥ 0; |A| = a_11 a_22 − a_12² ≥ 0.

In the 3 by 3 case, conditions (92) reduce to the following 7 determinantal conditions:

(94) a_11 ≥ 0; a_22 ≥ 0; a_33 ≥ 0; det[ a_11, a_12 ; a_12, a_22 ] ≥ 0; det[ a_11, a_13 ; a_13, a_33 ] ≥ 0; det[ a_22, a_23 ; a_23, a_33 ] ≥ 0; |A| ≥ 0.

If A is a symmetric N by N matrix, then necessary and sufficient determinantal conditions for A to be negative semidefinite are:

(95) (−1)¹ a_ii ≥ 0 for i = 1, 2, ..., N;
     (−1)² det[ a_{i1,i1}, a_{i1,i2} ; a_{i1,i2}, a_{i2,i2} ] ≥ 0 for 1 ≤ i1 < i2 ≤ N;
     (−1)³ det[ a_{i1,i1}, a_{i1,i2}, a_{i1,i3} ; a_{i1,i2}, a_{i2,i2}, a_{i2,i3} ; a_{i1,i3}, a_{i2,i3}, a_{i3,i3} ] ≥ 0 for 1 ≤ i1 < i2 < i3 ≤ N;
     :
     (−1)^N |A| ≥ 0.

Problems:

6. Let A ≡ [ … ; … 0 ]. Use the Gaussian diagonalization procedure to determine the definiteness properties of A.

7. Does the A defined in problem 6 above satisfy the determinantal conditions (93) for positive semidefiniteness?

8. Solve max_{x1,x2} {f(x_1, x_2): x_1 > 0, x_2 > 0} (if possible) where f is defined as follows:

(a) f(x_1, x_2) ≡ −x_1² + x_1 x_2 − x_2² + x_1 + x_2;
(b) f(x_1, x_2) ≡ ln x_1 + ln x_2 + x_1 x_2 − 2x_1 − 2x_2.

Check second order conditions when appropriate.

9. Consider the following 2 input, 1 output profit maximization problem:

(i) max_{y,x1,x2} {py − w_1 x_1 − w_2 x_2 : y = f(x_1, x_2)}

where f is the producer's production function, w_i > 0 is the price of input i and p > 0 is the price of output. The unconstrained maximization problem that is equivalent to (i) is:

(ii) max_{x1,x2} {p f(x_1, x_2) − w_1 x_1 − w_2 x_2}.

Assume that f is twice continuously differentiable, that x_1* = d_1(p, w_1, w_2) > 0 and x_2* = d_2(p, w_1, w_2) > 0 solve (ii), and that the first and second order sufficient conditions for a strict local maximum are satisfied at this point x_1*, x_2*. Note that the producer's supply function y* = s(p, w_1, w_2) can be determined as a function of the two input demand functions d_1 and d_2:

(iii) s(p, w_1, w_2) ≡ f[d_1(p, w_1, w_2), d_2(p, w_1, w_2)].

(a) Try to determine the signs of the following derivatives: ∂s(p, w_1, w_2)/∂p; ∂d_1(p, w_1, w_2)/∂w_1; ∂d_2(p, w_1, w_2)/∂w_2.

(b) Prove that: ∂d_1(p, w_1, w_2)/∂w_2 = ∂d_2(p, w_1, w_2)/∂w_1.

(c) Prove that: ∂s(p, w_1, w_2)/∂w_1 = −∂d_1(p, w_1, w_2)/∂p.

Note: (b) and (c) are Hotelling symmetry conditions. Hint: Look at the 2 first order conditions for (ii). Differentiate these 2 equations with respect to p; you will obtain a system of 2 equations involving the unknown derivatives ∂d_1(p, w_1, w_2)/∂p and ∂d_2(p, w_1, w_2)/∂p. Now differentiate the 2 first order conditions with respect to w_1; you will obtain a system of 2 equations involving the derivatives ∂d_1(p, w_1, w_2)/∂w_1 and ∂d_2(p, w_1, w_2)/∂w_1.

10. Let F ≡ [ f_11, f_12 ; f_21, f_22 ] be a symmetric matrix that satisfies the conditions:

(i) f_11 < 0;
(ii) f_11 f_22 − f_12² > 0.

Show that the following inequality holds:

(iii) −f_11 + 2f_12 − f_22 > 0.

Hint: −f_11 + 2f_12 − f_22 = −[1, −1] [ f_11, f_12 ; f_12, f_22 ] [1, −1]^T.

11. Consider a simple two sector model of the production sector of an economy. Sector 1 (the "service" sector) produces aggregate consumption C using an intermediate input M ("manufactured" goods) and inputs of labour L_1 according to the production function f:

(i) C = f(M, L_1).

Sector 2 (the "manufacturing" sector) produces the intermediate output M using inputs of labour L_2 according to the production function

(ii) M = L_2.

(Each sector can use other primary inputs such as capital, land or natural resource inputs, but since we hold these other inputs fixed in the short run, we suppress mention of them in the above notation.) There is an aggregate labour constraint in the economy:

(iii) L_1 + L_2 = L̄ > 0

where L̄ is fixed. The manufacturer gets the revenue p > 0 for each unit of manufacturing output produced, but the government puts a positive tax t > 0 on the sale of each unit of manufactures, so that the service sector producer faces the price p(1 + t) for each unit of M used. The service sector producer is assumed to be a competitive profit maximizer; i.e., M* = M(t) and L_1* = L_1(t) is the solution to:

(iv) max_{M,L1} {f(M, L_1) − p(1 + t)M − wL_1}

where w > 0 is the wage rate and the price of the consumption good is 1. We assume that the following first and second order conditions for the unconstrained maximization problem (iv) are satisfied:

(v) f_1(M*, L_1*) − p(1 + t) = 0;
(vi) f_2(M*, L_1*) − w = 0;
(vii) f_11* ≡ f_11(M*, L_1*) < 0;
(viii) f_22* ≡ f_22(M*, L_1*) < 0;
(ix) f_11* f_22* − (f_12*)² > 0 where f_12* ≡ f_12(M*, L_1*).

One more equation is required; namely, we assume that the price of the manufactured good is equal to the wage rate; i.e., we have:

(x) p = w.

Equation (x) is consistent with profit maximizing behavior in the manufacturing sector, assuming that the production function (ii) is valid.

Now substitute equations (ii), (iii) and (x) into the first order conditions (v) and (vi) and we obtain the following two equations, which characterize equilibrium in this simplified economy:

(xi) f_1[L̄ − L_1(t), L_1(t)] − w(t)(1 + t) = 0;
(xii) f_2[L̄ − L_1(t), L_1(t)] − w(t) = 0;

where the 2 unknowns in (xi) and (xii) are L_1(t) (employment in the service sector) and w(t) (the wage rate faced by both sectors), which are regarded as functions of the manufacturer's sales tax t.

(a) Differentiate (xi) and (xii) with respect to t and solve the resulting two equations for the derivatives L_1′(t) and w′(t).

(b) Show that L_1′(0) > 0. Hint: Use part (a) and problem 10 above.

Consumption regarded as a function of the level of sales taxation is defined as follows:

(xiii) C(t) ≡ f[L̄ − L_1(t), L_1(t)].

(c) Show that C′(t) = −t w(t) L_1′(t). Hint: Use (xi)-(xiii).

(d) Compute C′(0) and C″(0). Hint: Use part (c).

Now we can use the derivatives in part (d) above to calculate a second order Taylor series approximation to C(t); i.e., we have

(xiv) C(t) ≅ C(0) + C′(0)t + (1/2) C″(0)t².

(e) Treat (xiv) as an exact equality and show that C(t) < C(0) for t ≠ 0. Hint: Use parts (b) and (d).

Comment: This problem shows that, in general, the aggregate net output of the entire production sector falls if transactions between sectors are taxed. There are many applications of this result. Note that (xiv) shows that the loss of output is proportional to the square of the tax rate, t².

(f) Suppose that the government now subsidizes the output of the manufacturing sector; i.e., t is now negative instead of being positive. Can we still conclude that C(t) < C(0)?

This problem shows you that you now have the mathematical tools that will enable you to construct simple models that cast some light on real life, practical economic problems.
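A small numerical sketch of the comment above; the Cobb-Douglas form f(M, L_1) = M^0.3 L_1^0.3 and L̄ = 1 are illustrative assumptions, not taken from the problem. With equal exponents, the equilibrium conditions (xi)-(xii) imply f_1/f_2 = L_1/M = 1 + t with M = L̄ − L_1, which solves to L_1(t) = L̄(1 + t)/(2 + t); computing C(t) directly then shows C(t) < C(0) for t ≠ 0, with a loss of roughly quadratic order in t.

```python
import numpy as np

# Illustrative Cobb-Douglas technology for the service sector; the functional
# form and parameter values are assumptions made only for this sketch.
a = b = 0.3
Lbar = 1.0

def L1_of_t(t):
    # With equal exponents, (xi) and (xii) imply f1/f2 = L1/M = 1 + t with
    # M = Lbar - L1, which solves to L1(t) = Lbar*(1 + t)/(2 + t).
    return Lbar * (1.0 + t) / (2.0 + t)

def C(t):
    L1 = L1_of_t(t)
    return (Lbar - L1) ** a * L1 ** b    # C(t) = f(M(t), L1(t)), as in (xiii)

for t in [-0.2, -0.1, 0.0, 0.1, 0.2]:
    print(t, C(0.0) - C(t))              # positive for t != 0, roughly ~ t**2
```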

7. Linearly Homogeneous Functions and Euler's Theorem

Let f(x_1, ..., x_N) ≡ f(x) be a function of N variables defined over the positive orthant, Ω ≡ {x: x >> 0_N}. Note that x >> 0_N means that each component of x is positive, while x ≥ 0_N means that each component of x is nonnegative. Finally, x > 0_N means x ≥ 0_N but x ≠ 0_N (i.e., the components of x are nonnegative and at least one component is positive).

(96) Definition: f is (positively) linearly homogeneous iff f(λx) = λf(x) for all λ > 0 and x >> 0_N.

(97) Definition: f is (positively) homogeneous of degree α iff f(λx) = λ^α f(x) for all λ > 0 and x >> 0_N.

We often assume that production functions and utility functions are linearly homogeneous. If the producer's production function f is linearly homogeneous, then we say that the technology is subject to constant returns to scale; i.e., if we double all inputs, output also doubles. If the production function f is homogeneous of degree α < 1, then we say that the technology is subject to diminishing returns to scale, while if α > 1, then we have increasing returns to scale. Functions that are homogeneous of degree 1, 0 or −1 occur frequently in index number theory.

Recall the profit maximization problem (i) in Problem 9 above. The optimized objective function, π(p, w_1, w_2), in that problem is called the firm's profit function and it turns out to be linearly homogeneous in (p, w_1, w_2). For another example of a linearly homogeneous function, consider the problem which defines the producer's cost function. Let x ≥ 0_N be a vector of inputs, let y ≥ 0 be the output produced by the inputs x and let y = f(x) be the producer's production function. Let p >> 0_N be a vector of input prices that the producer faces, and define the producer's cost function as

(98) C(y, p) ≡ min_{x ≥ 0_N} {p^T x: f(x) ≥ y}.

It can readily be seen that, for fixed y, C(y, p) is linearly homogeneous in the components of p; i.e., let λ > 0, p >> 0_N and we have

(99) C(y, λp) ≡ min_{x ≥ 0_N} {λp^T x: f(x) ≥ y}
             = λ min_{x ≥ 0_N} {p^T x: f(x) ≥ y}   using λ > 0
             = λC(y, p).
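A minimal numerical check of the homogeneity property (99), under the assumed production function f(x_1, x_2) = (x_1 x_2)^{1/2} (a hypothetical example chosen only for this sketch): a crude grid search approximates C(y, p), and scaling both prices by λ scales the minimized cost by λ.

```python
import numpy as np

# Crude grid-search check of (99) for the assumed production function
# f(x1, x2) = (x1*x2)**0.5, so that f(x) >= y binds along the curve x2 = y**2/x1.
def cost(y, p1, p2, n=4000):
    x1 = np.linspace(0.01, 50.0, n)
    x2 = y**2 / x1                      # input bundles that just produce output y
    return np.min(p1 * x1 + p2 * x2)

y, (p1, p2), lam = 2.0, (3.0, 5.0), 7.0
print(cost(y, p1, p2))                  # approximately 2*y*sqrt(p1*p2)
print(cost(y, lam * p1, lam * p2))      # approximately lam times the line above
```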

Now recall the definition of a linearly homogeneous function f given by (96). We have the following two very useful theorems that apply to differentiable linearly homogeneous functions.

Euler's First Theorem: If f is linearly homogeneous and once continuously differentiable, then its first order partial derivative functions, f_i(x) for i = 1, 2, ..., N, are homogeneous of degree zero and

(100) f(x) = Σ_{i=1}^N x_i f_i(x) = x^T ∇f(x).

Proof: Partially differentiate both sides of the equation in (96) with respect to x_i; we get, for i = 1, 2, ..., N:

(101) f_i(λx) λ = λ f_i(x) for all x >> 0_N and λ > 0, or

(102) f_i(λx) = f_i(x) = λ⁰ f_i(x) for all x >> 0_N and λ > 0.

Using definition (97) for α = 0, we see that equation (102) implies that f_i is homogeneous of degree 0. To establish (100), partially differentiate both sides of the equation in (96) with respect to λ and get:

(103) Σ_{i=1}^N f_i(λx_1, λx_2, ..., λx_N) ∂(λx_i)/∂λ = f(x) or Σ_{i=1}^N f_i(λx_1, λx_2, ..., λx_N) x_i = f(x).

Now set λ = 1 in (103) to obtain (100). Q.E.D.

Euler's Second Theorem: If f is linearly homogeneous and twice continuously differentiable, then the second order partial derivatives of f satisfy the following N linear restrictions: for i = 1, ..., N:

(104) Σ_{j=1}^N f_ij(x) x_j = 0 for x ≡ (x_1, ..., x_N)^T >> 0_N.

The restrictions (104) can be rewritten as follows:

(105) ∇²f(x) x = 0_N for every x >> 0_N.

Proof: For each i, partially differentiate both sides of equation (102) with respect to λ and get, for i = 1, 2, ..., N:

(106) Σ_{j=1}^N f_ij(λx_1, ..., λx_N) ∂(λx_j)/∂λ = 0 or Σ_{j=1}^N f_ij(λx) x_j = 0.

Now set λ = 1 in (106) and the resulting equations are equations (104). Q.E.D.

Problems:

12. [Shephard's Lemma]. Suppose that the producer's cost function C(y, p) is defined by (98) above. Suppose that when p = p* >> 0_N and y = y* > 0, x* > 0_N solves the cost minimization problem, so that

(i) p*^T x* = C(y*, p*) ≡ min_x {p*^T x: f(x) ≥ y*}.

(a) Suppose further that C is differentiable with respect to the input prices at (y*, p*). Then show that

(ii) x* = ∇_p C(y*, p*).

Hint: Because x* solves the cost minimization problem defined by C(y*, p*) by hypothesis, x* must be feasible for this problem, so we must have f(x*) ≥ y*. Thus x* is a feasible solution for the following cost minimization problem, where the general input price vector p >> 0_N has replaced the specific input price vector p* >> 0_N:

(iii) C(y*, p) ≡ min_x {p^T x: f(x) ≥ y*} ≤ p^T x*

where the inequality follows from the fact that x* is a feasible (but usually not optimal) solution for the cost minimization problem in (iii). Now define for each p >> 0_N:

(iv) g(p) ≡ p^T x* − C(y*, p).

Use (i) and (iii) to show that g(p) is minimized (over all p such that p >> 0_N) at p = p*. Now recall the first order necessary conditions for a minimum.

(b) Under the hypotheses of part (a), suppose x** > 0_N is another solution to the cost minimization problem defined in (i). Then show x* = x**; i.e., the solution to (i) is unique under the assumption that C(y*, p*) is differentiable with respect to the components of p.

13. Suppose C(y, p) defined by (98) is twice continuously differentiable with respect to the components of the input price vector p and let the vector x(y, p) solve (98); i.e., x(y, p) ≡ [x_1(y, p), ..., x_N(y, p)]^T is the producer's system of cost minimizing input demand functions. Define the N by N matrix of first order partial derivatives of the x_i(y, p) with respect to the components of p as:

(i) A ≡ [∂x_i(y, p_1, ..., p_N)/∂p_j] (= ∇_p x(y, p)).

Show that:

(ii) A = A^T and
(iii) Ap = 0_N.

Hint: By the previous problem, x(y, p) = ∇_p C(y, p). Recall also (99) and Euler's Second Theorem.

Comment: The restrictions (ii) and (iii) above were first derived by J.R. Hicks (1939), Value and Capital, Appendix to Chapters II and III, part 8, and P.A. Samuelson (1947), Foundations of Economic Analysis, page 69. The restrictions (ii) on the input demand derivatives ∂x_i/∂p_j are known as the Hicks-Samuelson symmetry conditions.

So far, we have developed two methods for checking the second order conditions that arise in unconstrained optimization theory: (i) the Lagrange-Gauss diagonalization procedure explained in section 5 above and (ii) the determinantal conditions method explained in section 6 above. In the final sections of this chapter, we are going to derive a third method: the eigenvalue method. Before we can explain this method, we require some preliminary material on complex numbers.

8. Complex Numbers and the Fundamental Theorem of Algebra

(107) Definition: i is an algebraic symbol which has the property i² = −1. Hence i can be regarded as the square root of −1; i.e., i ≡ √(−1).

(108) Definition: A complex number z is a number which has the form z = x + iy where x and y are ordinary real numbers. The number x is called the real part of z and the number y is called the imaginary part of z.

We can add and multiply complex numbers. To add two complex numbers, we merely add their real parts and imaginary parts to form the sum; i.e., if z_1 ≡ x_1 + iy_1 and z_2 ≡ x_2 + iy_2, then

(109) z_1 + z_2 = [x_1 + iy_1] + [x_2 + iy_2] ≡ (x_1 + x_2) + (y_1 + y_2)i.

To multiply together two complex numbers z_1 and z_2, we multiply them together using ordinary algebra, replacing i² by −1; i.e.,

(110) z_1 z_2 = [x_1 + iy_1][x_2 + iy_2]
             = x_1 x_2 + iy_1 x_2 + ix_1 y_2 + i² y_1 y_2

             = x_1 x_2 + i² y_1 y_2 + (x_1 y_2 + x_2 y_1)i
             ≡ (x_1 x_2 − y_1 y_2) + (x_1 y_2 + x_2 y_1)i.

Two complex numbers are equal iff their real parts and imaginary parts are identical; i.e., if z_1 = x_1 + iy_1 and z_2 = x_2 + iy_2, then z_1 = z_2 iff x_1 = x_2 and y_1 = y_2. The final definition we require in this section is the definition of a complex conjugate.

(111) Definition: If z = x + iy, then the complex conjugate of z, denoted by z̄, is defined as the complex number x − iy; i.e., z̄ ≡ x − iy.

An interesting property of a complex number and its complex conjugate is given in Problem 15 below.

Problems:

14. Let a ≡ 3 + i; b ≡ 1 + 5i and c ≡ 5 − 2i. Calculate ab − c. Note that we have written a·b as ab.

15. Show that z z̄ ≥ 0 for any complex number z = x + iy.

16. Let z_1 = x_1 + iy_1 and z_2 = x_2 + iy_2 be two complex numbers; calculate z_3 ≡ z_1 z_2. Show that z̄_3 = z̄_1 z̄_2; i.e., the complex conjugate of a product of two complex numbers is equal to the product of the complex conjugates.

Now let f(x) be a polynomial of degree N; i.e.,

(112) f(x) ≡ a_0 + a_1 x + a_2 x² + ... + a_N x^N where a_N ≠ 0,

where the fixed numbers a_0, a_1, a_2, ..., a_N are ordinary real numbers. If we try to solve the equation f(x) = 0 for real roots x, then it can happen that no real roots to this polynomial equation exist; e.g., consider

(113) 1 + x² = 0

so that x² = −1 and no real roots to (113) exist. However, note that if we allow solutions x to (113) to be complex numbers, then (113) has the roots x_1 = i and x_2 = −i. In general, if we allow solutions to the equation f(x) = 0 (where f is defined by (112)) to be complex numbers, then there are always N roots to the equation (some of which could be repeated or multiple roots).

(114) Fundamental Theorem of Algebra: Every polynomial equation of the form a_0 + a_1 x + a_2 x² + ... + a_N x^N = 0 (with a_N ≠ 0) has N roots or solutions, x_1, x_2, ..., x_N, where in general the x_i are complex numbers.

This is one of the few theorems which we will not prove in this course. For a

proof, see J.V. Uspensky, Theory of Equations.

9. The Eigenvalues and Eigenvectors of a Symmetric Matrix

Let A be a general N by N matrix; i.e., it is not restricted to be symmetric at this point.

(115) Definition: λ is an eigenvalue of A with the corresponding eigenvector z ≡ [z_1, z_2, ..., z_N]^T ≠ 0_N iff λ and z satisfy the following equation:

(116) Az = λz; z ≠ 0_N.

Note that the eigenvector z which appears in (116) is not allowed to be a vector of zeros. In the following theorem, we restrict A to be a symmetric matrix. In the case of a general N by N nonsymmetric A matrix, the eigenvalue λ which appears in (116) is allowed to be a complex number and the eigenvector z which appears in (116) is allowed to be a vector of complex numbers; i.e., z is allowed to have the form z = x + iy where x and y are N dimensional vectors of real numbers.

(117) Theorem: Every N by N symmetric matrix A has N eigenvalues λ_1, λ_2, ..., λ_N, where these eigenvalues are real numbers.

Proof: The equation (116) is equivalent to:

(118) [A − λI_N]z = 0_N; z ≠ 0_N.

Now if [A − λI_N]^{−1} were to exist, then we could premultiply both sides of (118) by this inverse matrix and obtain:

(119) [A − λI_N]^{−1}[A − λI_N]z = [A − λI_N]^{−1} 0_N = 0_N, or z = 0_N.

But z = 0_N is not admissible as an eigenvector by definition (115). From our earlier material on determinants, we know that [A − λI_N]^{−1} exists iff |A − λI_N| ≠ 0. Hence, in order to hope to find a λ and z ≠ 0_N which satisfy (116), we must have:

(120) |A − λI_N| = 0.

If N = 2, the determinantal equation (120) becomes:

(121) 0 = det( [ a_11, a_12 ; a_12, a_22 ] − [ λ, 0 ; 0, λ ] )

       = det[ a_11 − λ, a_12 ; a_12, a_22 − λ ]
       = (a_11 − λ)(a_22 − λ) − a_12²,

which is a quadratic equation in λ. In the general N by N case, if we expand out the determinantal equation (120), we obtain an equation of degree N in λ of the form b_0 + b_1 λ + b_2 λ² + ... + b_N λ^N = 0, and by the Fundamental Theorem of Algebra, this polynomial equation has N roots, λ_1, λ_2, ..., λ_N say. Once we have found these eigenvalues λ_i, we can obtain corresponding eigenvectors z^i ≠ 0_N by solving

(122) [A − λ_i I_N] z^i = 0_N; i = 1, 2, ..., N

for a nonzero vector z^i. (We will show exactly how this can be done later.) However, both the eigenvalues λ_i and the eigenvectors z^i can have complex numbers as components in general. We now show that the eigenvalues and eigenvectors have real numbers as components when A = A^T.

Suppose that λ_1 is an eigenvalue of A (where λ_1 = a_1 + b_1 i say) and z^1 = x^1 + iy^1 is the corresponding eigenvector. Since z^1 ≠ 0_N, at least one component of the x^1 and y^1 vectors must be nonzero. Thus, letting z̄^1 ≡ x^1 − iy^1 be the vector of complex conjugates of the components of z^1, we have

z̄^1T z^1 = [x^1T − iy^1T][x^1 + iy^1]
         = x^1T x^1 − i² y^1T y^1 + ix^1T y^1 − iy^1T x^1
         = x^1T x^1 + y^1T y^1 + i[x^1T y^1 − y^1T x^1]
         = x^1T x^1 + y^1T y^1                since x^1T y^1 = y^1T x^1
         = Σ_{i=1}^N (x_i^1)² + Σ_{i=1}^N (y_i^1)²
(123)    > 0

where the inequality follows since at least one of the x_i^1 or y_i^1 is not equal to zero and hence its square is positive. By the definition of λ_1 and z^1 being an eigenvalue and eigenvector of A, we have:

(124) Az^1 = λ_1 z^1.

Since A is a real matrix, the matrix of complex conjugates of A, Ā, is A. Now take complex conjugates on both sides of (124). Using Ā = A and Problem 16 above, we obtain:

(125) Az̄^1 = λ̄_1 z̄^1.

Premultiply both sides of (124) by z̄^1T and we obtain the following equality:

(126) z̄^1T A z^1 = λ_1 z̄^1T z^1.

Now take transposes of both sides of (126) and we obtain:

(127) λ_1 z^1T z̄^1 = z^1T A^T z̄^1 = z^1T A z̄^1

where the second equality in (127) follows from the symmetry of A; i.e., A = A^T. Now premultiply both sides of (125) by z^1T and obtain:

(128) λ̄_1 z^1T z̄^1 = z^1T A z̄^1.

Since the right hand sides of (127) and (128) are equal, so are the left hand sides, so we obtain the following equality:

(129) λ_1 z^1T z̄^1 = λ̄_1 z^1T z̄^1.

Using (123), we see that z^1T z̄^1 (= z̄^1T z^1) is a positive number, so we can divide both sides of (129) by z^1T z̄^1 to obtain:

(130) λ_1 = a_1 + b_1 i = λ̄_1 = a_1 − b_1 i,

which in turn implies that the imaginary part of λ_1 must be zero; i.e., we find that b_1 = 0 and hence the eigenvalue λ_1 must be an ordinary real number.

To find a real eigenvector z^1 = x^1 + i0_N = x^1 ≠ 0_N that corresponds to the eigenvalue λ_1, define the N by N matrix B_1 as

(131) B_1 ≡ A − λ_1 I_N.

We know that |B_1| = 0 and we need to find a vector x^1 ≠ 0_N such that B_1 x^1 = 0_N. Apply the Gaussian triangularization algorithm to B_1. This leads to an elementary row matrix E_1 with |E_1| = 1 and

(132) E_1 B_1 = U_1

where U_1 is an upper triangular N by N matrix. Since |B_1| = 0, taking determinants on both sides of (132) leads to |U_1| = 0 and hence at least one of the N diagonal elements u_ii^1 of U_1 must be zero. Let u_{i1,i1}^1 be the first such zero diagonal element. We choose the components of the x^1 vector as follows: let x_{i1}^1 ...
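As a quick numerical cross-check of Theorem (117), the following sketch (NumPy assumed) computes the eigenvalues of the symmetric matrix A from the section 5 example: all N eigenvalues are real, and their sign pattern (two positive, one negative) matches that of the diagonal matrix D = diag(1, −3, 4) obtained there.

```python
import numpy as np

# Numerical cross-check of Theorem (117) on the symmetric matrix A used in
# the section 5 example.
A = np.array([[1.0, 2.0, 3.0],
              [2.0, 1.0, 0.0],
              [3.0, 0.0, 1.0]])
print(np.linalg.eigvalsh(A))   # eigvalsh exploits symmetry; all roots are real
# The sign pattern (one negative, two positive eigenvalues) agrees with the
# diagonal D = diag(1, -3, 4) produced by the Lagrange-Gauss procedure.
```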

Thus necessary and sufficient conditions for A to be positive definite are:

Thus necessary and sufficient conditions for A to be positive definite are: 14 Problem: 4. Define E = E 3 E 2 E 1 where E 3 is defined by (62) and E 1 and E 2 are defined in (61). Show that EAE T = D where D is defined by (60). The matrix E and the diagonal matrix D which occurs

More information

This property turns out to be a general property of eigenvectors of a symmetric A that correspond to distinct eigenvalues as we shall see later.

This property turns out to be a general property of eigenvectors of a symmetric A that correspond to distinct eigenvalues as we shall see later. 34 To obtain an eigenvector x 2 0 2 for l 2 = 0, define: B 2 A - l 2 I 2 = È 1, 1, 1 Î 1-0 È 1, 0, 0 Î 1 = È 1, 1, 1 Î 1. To transform B 2 into an upper triangular matrix, subtract the first row of B 2

More information

Lemma 8: Suppose the N by N matrix A has the following block upper triangular form:

Lemma 8: Suppose the N by N matrix A has the following block upper triangular form: 17 4 Determinants and the Inverse of a Square Matrix In this section, we are going to use our knowledge of determinants and their properties to derive an explicit formula for the inverse of a square matrix

More information

CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS. W. Erwin Diewert January 31, 2008.

CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS. W. Erwin Diewert January 31, 2008. 1 ECONOMICS 594: LECTURE NOTES CHAPTER 2: CONVEX SETS AND CONCAVE FUNCTIONS W. Erwin Diewert January 31, 2008. 1. Introduction Many economic problems have the following structure: (i) a linear function

More information

By W.E. Diewert. December 13, 2008

By W.E. Diewert. December 13, 2008 1 APPLIED ECONOMICS By W.E. Diewert. December 13, 2008 Chapter 9: Flexible Functional Forms 1. Introduction In this chapter, we will take an in depth look at the problems involved in choosing functional

More information

Digital Workbook for GRA 6035 Mathematics

Digital Workbook for GRA 6035 Mathematics Eivind Eriksen Digital Workbook for GRA 6035 Mathematics November 10, 2014 BI Norwegian Business School Contents Part I Lectures in GRA6035 Mathematics 1 Linear Systems and Gaussian Elimination........................

More information

OR MSc Maths Revision Course

OR MSc Maths Revision Course OR MSc Maths Revision Course Tom Byrne School of Mathematics University of Edinburgh t.m.byrne@sms.ed.ac.uk 15 September 2017 General Information Today JCMB Lecture Theatre A, 09:30-12:30 Mathematics revision

More information

3 (Maths) Linear Algebra

3 (Maths) Linear Algebra 3 (Maths) Linear Algebra References: Simon and Blume, chapters 6 to 11, 16 and 23; Pemberton and Rau, chapters 11 to 13 and 25; Sundaram, sections 1.3 and 1.5. The methods and concepts of linear algebra

More information

6.1 Matrices. Definition: A Matrix A is a rectangular array of the form. A 11 A 12 A 1n A 21. A 2n. A m1 A m2 A mn A 22.

6.1 Matrices. Definition: A Matrix A is a rectangular array of the form. A 11 A 12 A 1n A 21. A 2n. A m1 A m2 A mn A 22. 61 Matrices Definition: A Matrix A is a rectangular array of the form A 11 A 12 A 1n A 21 A 22 A 2n A m1 A m2 A mn The size of A is m n, where m is the number of rows and n is the number of columns The

More information

Mathematical Economics (ECON 471) Lecture 3 Calculus of Several Variables & Implicit Functions

Mathematical Economics (ECON 471) Lecture 3 Calculus of Several Variables & Implicit Functions Mathematical Economics (ECON 471) Lecture 3 Calculus of Several Variables & Implicit Functions Teng Wah Leo 1 Calculus of Several Variables 11 Functions Mapping between Euclidean Spaces Where as in univariate

More information

Index. Cambridge University Press An Introduction to Mathematics for Economics Akihito Asano. Index.

Index. Cambridge University Press An Introduction to Mathematics for Economics Akihito Asano. Index. , see Q.E.D. ln, see natural logarithmic function e, see Euler s e i, see imaginary number log 10, see common logarithm ceteris paribus, 4 quod erat demonstrandum, see Q.E.D. reductio ad absurdum, see

More information

Tutorial Code and TA (circle one): T1 Charles Tsang T2 Stephen Tang

Tutorial Code and TA (circle one): T1 Charles Tsang T2 Stephen Tang Department of Computer & Mathematical Sciences University of Toronto at Scarborough MATA33H3Y: Calculus for Management II Final Examination August, 213 Examiner: A. Chow Surname (print): Given Name(s)

More information

September Math Course: First Order Derivative

September Math Course: First Order Derivative September Math Course: First Order Derivative Arina Nikandrova Functions Function y = f (x), where x is either be a scalar or a vector of several variables (x,..., x n ), can be thought of as a rule which

More information

Mathematical Economics. Lecture Notes (in extracts)

Mathematical Economics. Lecture Notes (in extracts) Prof. Dr. Frank Werner Faculty of Mathematics Institute of Mathematical Optimization (IMO) http://math.uni-magdeburg.de/ werner/math-ec-new.html Mathematical Economics Lecture Notes (in extracts) Winter

More information

Chapter 2: Unconstrained Extrema

Chapter 2: Unconstrained Extrema Chapter 2: Unconstrained Extrema Math 368 c Copyright 2012, 2013 R Clark Robinson May 22, 2013 Chapter 2: Unconstrained Extrema 1 Types of Sets Definition For p R n and r > 0, the open ball about p of

More information

Math (P)refresher Lecture 8: Unconstrained Optimization

Math (P)refresher Lecture 8: Unconstrained Optimization Math (P)refresher Lecture 8: Unconstrained Optimization September 2006 Today s Topics : Quadratic Forms Definiteness of Quadratic Forms Maxima and Minima in R n First Order Conditions Second Order Conditions

More information

Functions of Several Variables

Functions of Several Variables Functions of Several Variables The Unconstrained Minimization Problem where In n dimensions the unconstrained problem is stated as f() x variables. minimize f()x x, is a scalar objective function of vector

More information

REVIEW OF DIFFERENTIAL CALCULUS

REVIEW OF DIFFERENTIAL CALCULUS REVIEW OF DIFFERENTIAL CALCULUS DONU ARAPURA 1. Limits and continuity To simplify the statements, we will often stick to two variables, but everything holds with any number of variables. Let f(x, y) be

More information

Production Possibility Frontier

Production Possibility Frontier Division of the Humanities and Social Sciences Production Possibility Frontier KC Border v 20151111::1410 This is a very simple model of the production possibilities of an economy, which was formulated

More information

N. L. P. NONLINEAR PROGRAMMING (NLP) deals with optimization models with at least one nonlinear function. NLP. Optimization. Models of following form:

N. L. P. NONLINEAR PROGRAMMING (NLP) deals with optimization models with at least one nonlinear function. NLP. Optimization. Models of following form: 0.1 N. L. P. Katta G. Murty, IOE 611 Lecture slides Introductory Lecture NONLINEAR PROGRAMMING (NLP) deals with optimization models with at least one nonlinear function. NLP does not include everything

More information

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2.

APPENDIX A. Background Mathematics. A.1 Linear Algebra. Vector algebra. Let x denote the n-dimensional column vector with components x 1 x 2. APPENDIX A Background Mathematics A. Linear Algebra A.. Vector algebra Let x denote the n-dimensional column vector with components 0 x x 2 B C @. A x n Definition 6 (scalar product). The scalar product

More information

Lecture 2 INF-MAT : , LU, symmetric LU, Positve (semi)definite, Cholesky, Semi-Cholesky

Lecture 2 INF-MAT : , LU, symmetric LU, Positve (semi)definite, Cholesky, Semi-Cholesky Lecture 2 INF-MAT 4350 2009: 7.1-7.6, LU, symmetric LU, Positve (semi)definite, Cholesky, Semi-Cholesky Tom Lyche and Michael Floater Centre of Mathematics for Applications, Department of Informatics,

More information

Chapter 2: Matrix Algebra

Chapter 2: Matrix Algebra Chapter 2: Matrix Algebra (Last Updated: October 12, 2016) These notes are derived primarily from Linear Algebra and its applications by David Lay (4ed). Write A = 1. Matrix operations [a 1 a n. Then entry

More information

Here each term has degree 2 (the sum of exponents is 2 for all summands). A quadratic form of three variables looks as

Here each term has degree 2 (the sum of exponents is 2 for all summands). A quadratic form of three variables looks as Reading [SB], Ch. 16.1-16.3, p. 375-393 1 Quadratic Forms A quadratic function f : R R has the form f(x) = a x. Generalization of this notion to two variables is the quadratic form Q(x 1, x ) = a 11 x

More information

ECON2285: Mathematical Economics

ECON2285: Mathematical Economics ECON2285: Mathematical Economics Yulei Luo FBE, HKU September 2, 2018 Luo, Y. (FBE, HKU) ME September 2, 2018 1 / 35 Course Outline Economics: The study of the choices people (consumers, firm managers,

More information

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems Robert M. Freund February 2016 c 2016 Massachusetts Institute of Technology. All rights reserved. 1 1 Introduction

More information

Matrices 2. Slide for MA1203 Business Mathematics II Week 4

Matrices 2. Slide for MA1203 Business Mathematics II Week 4 Matrices 2 Slide for MA1203 Business Mathematics II Week 4 2.7 Leontief Input Output Model Input Output Analysis One important applications of matrix theory to the field of economics is the study of the

More information

Linear Algebra. Matrices Operations. Consider, for example, a system of equations such as x + 2y z + 4w = 0, 3x 4y + 2z 6w = 0, x 3y 2z + w = 0.

Linear Algebra. Matrices Operations. Consider, for example, a system of equations such as x + 2y z + 4w = 0, 3x 4y + 2z 6w = 0, x 3y 2z + w = 0. Matrices Operations Linear Algebra Consider, for example, a system of equations such as x + 2y z + 4w = 0, 3x 4y + 2z 6w = 0, x 3y 2z + w = 0 The rectangular array 1 2 1 4 3 4 2 6 1 3 2 1 in which the

More information

WI1403-LR Linear Algebra. Delft University of Technology

WI1403-LR Linear Algebra. Delft University of Technology WI1403-LR Linear Algebra Delft University of Technology Year 2013 2014 Michele Facchinelli Version 10 Last modified on February 1, 2017 Preface This summary was written for the course WI1403-LR Linear

More information

Linear Algebra: Characteristic Value Problem

Linear Algebra: Characteristic Value Problem Linear Algebra: Characteristic Value Problem . The Characteristic Value Problem Let < be the set of real numbers and { be the set of complex numbers. Given an n n real matrix A; does there exist a number

More information

Chapter 3 Transformations

Chapter 3 Transformations Chapter 3 Transformations An Introduction to Optimization Spring, 2014 Wei-Ta Chu 1 Linear Transformations A function is called a linear transformation if 1. for every and 2. for every If we fix the bases

More information

Basics of Calculus and Algebra

Basics of Calculus and Algebra Monika Department of Economics ISCTE-IUL September 2012 Basics of linear algebra Real valued Functions Differential Calculus Integral Calculus Optimization Introduction I A matrix is a rectangular array

More information

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra. DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1

More information

Midterm for Introduction to Numerical Analysis I, AMSC/CMSC 466, on 10/29/2015

Midterm for Introduction to Numerical Analysis I, AMSC/CMSC 466, on 10/29/2015 Midterm for Introduction to Numerical Analysis I, AMSC/CMSC 466, on 10/29/2015 The test lasts 1 hour and 15 minutes. No documents are allowed. The use of a calculator, cell phone or other equivalent electronic

More information

Symmetric Matrices and Eigendecomposition

Symmetric Matrices and Eigendecomposition Symmetric Matrices and Eigendecomposition Robert M. Freund January, 2014 c 2014 Massachusetts Institute of Technology. All rights reserved. 1 2 1 Symmetric Matrices and Convexity of Quadratic Functions

More information

Math Matrix Algebra

Math Matrix Algebra Math 44 - Matrix Algebra Review notes - (Alberto Bressan, Spring 7) sec: Orthogonal diagonalization of symmetric matrices When we seek to diagonalize a general n n matrix A, two difficulties may arise:

More information

Assignment 1: From the Definition of Convexity to Helley Theorem

Assignment 1: From the Definition of Convexity to Helley Theorem Assignment 1: From the Definition of Convexity to Helley Theorem Exercise 1 Mark in the following list the sets which are convex: 1. {x R 2 : x 1 + i 2 x 2 1, i = 1,..., 10} 2. {x R 2 : x 2 1 + 2ix 1x

More information

Elementary Linear Algebra

Elementary Linear Algebra Matrices J MUSCAT Elementary Linear Algebra Matrices Definition Dr J Muscat 2002 A matrix is a rectangular array of numbers, arranged in rows and columns a a 2 a 3 a n a 2 a 22 a 23 a 2n A = a m a mn We

More information

x +3y 2t = 1 2x +y +z +t = 2 3x y +z t = 7 2x +6y +z +t = a

x +3y 2t = 1 2x +y +z +t = 2 3x y +z t = 7 2x +6y +z +t = a UCM Final Exam, 05/8/014 Solutions 1 Given the parameter a R, consider the following linear system x +y t = 1 x +y +z +t = x y +z t = 7 x +6y +z +t = a (a (6 points Discuss the system depending on the

More information

Duality. for The New Palgrave Dictionary of Economics, 2nd ed. Lawrence E. Blume

Duality. for The New Palgrave Dictionary of Economics, 2nd ed. Lawrence E. Blume Duality for The New Palgrave Dictionary of Economics, 2nd ed. Lawrence E. Blume Headwords: CONVEXITY, DUALITY, LAGRANGE MULTIPLIERS, PARETO EFFICIENCY, QUASI-CONCAVITY 1 Introduction The word duality is

More information

Linear Algebra. Solving Linear Systems. Copyright 2005, W.R. Winfrey

Linear Algebra. Solving Linear Systems. Copyright 2005, W.R. Winfrey Copyright 2005, W.R. Winfrey Topics Preliminaries Echelon Form of a Matrix Elementary Matrices; Finding A -1 Equivalent Matrices LU-Factorization Topics Preliminaries Echelon Form of a Matrix Elementary

More information

CSC Linear Programming and Combinatorial Optimization Lecture 10: Semidefinite Programming

CSC Linear Programming and Combinatorial Optimization Lecture 10: Semidefinite Programming CSC2411 - Linear Programming and Combinatorial Optimization Lecture 10: Semidefinite Programming Notes taken by Mike Jamieson March 28, 2005 Summary: In this lecture, we introduce semidefinite programming

More information

Conjugate Gradient (CG) Method

Conjugate Gradient (CG) Method Conjugate Gradient (CG) Method by K. Ozawa 1 Introduction In the series of this lecture, I will introduce the conjugate gradient method, which solves efficiently large scale sparse linear simultaneous

More information

GATE Engineering Mathematics SAMPLE STUDY MATERIAL. Postal Correspondence Course GATE. Engineering. Mathematics GATE ENGINEERING MATHEMATICS

GATE Engineering Mathematics SAMPLE STUDY MATERIAL. Postal Correspondence Course GATE. Engineering. Mathematics GATE ENGINEERING MATHEMATICS SAMPLE STUDY MATERIAL Postal Correspondence Course GATE Engineering Mathematics GATE ENGINEERING MATHEMATICS ENGINEERING MATHEMATICS GATE Syllabus CIVIL ENGINEERING CE CHEMICAL ENGINEERING CH MECHANICAL

More information

MA22S3 Summary Sheet: Ordinary Differential Equations

MA22S3 Summary Sheet: Ordinary Differential Equations MA22S3 Summary Sheet: Ordinary Differential Equations December 14, 2017 Kreyszig s textbook is a suitable guide for this part of the module. Contents 1 Terminology 1 2 First order separable 2 2.1 Separable

More information

Linear Algebra. Linear Equations and Matrices. Copyright 2005, W.R. Winfrey

Linear Algebra. Linear Equations and Matrices. Copyright 2005, W.R. Winfrey Copyright 2005, W.R. Winfrey Topics Preliminaries Systems of Linear Equations Matrices Algebraic Properties of Matrix Operations Special Types of Matrices and Partitioned Matrices Matrix Transformations

More information

3 Matrix Algebra. 3.1 Operations on matrices

3 Matrix Algebra. 3.1 Operations on matrices 3 Matrix Algebra A matrix is a rectangular array of numbers; it is of size m n if it has m rows and n columns. A 1 n matrix is a row vector; an m 1 matrix is a column vector. For example: 1 5 3 5 3 5 8

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

The Derivative. Appendix B. B.1 The Derivative of f. Mappings from IR to IR

The Derivative. Appendix B. B.1 The Derivative of f. Mappings from IR to IR Appendix B The Derivative B.1 The Derivative of f In this chapter, we give a short summary of the derivative. Specifically, we want to compare/contrast how the derivative appears for functions whose domain

More information

Homework 2 Foundations of Computational Math 2 Spring 2019

Homework 2 Foundations of Computational Math 2 Spring 2019 Homework 2 Foundations of Computational Math 2 Spring 2019 Problem 2.1 (2.1.a) Suppose (v 1,λ 1 )and(v 2,λ 2 ) are eigenpairs for a matrix A C n n. Show that if λ 1 λ 2 then v 1 and v 2 are linearly independent.

More information

The general programming problem is the nonlinear programming problem where a given function is maximized subject to a set of inequality constraints.

The general programming problem is the nonlinear programming problem where a given function is maximized subject to a set of inequality constraints. 1 Optimization Mathematical programming refers to the basic mathematical problem of finding a maximum to a function, f, subject to some constraints. 1 In other words, the objective is to find a point,

More information

EC487 Advanced Microeconomics, Part I: Lecture 2

EC487 Advanced Microeconomics, Part I: Lecture 2 EC487 Advanced Microeconomics, Part I: Lecture 2 Leonardo Felli 32L.LG.04 6 October, 2017 Properties of the Profit Function Recall the following property of the profit function π(p, w) = max x p f (x)

More information

MATH 5720: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 2018

MATH 5720: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 2018 MATH 57: Unconstrained Optimization Hung Phan, UMass Lowell September 13, 18 1 Global and Local Optima Let a function f : S R be defined on a set S R n Definition 1 (minimizers and maximizers) (i) x S

More information

Matrices. Chapter What is a Matrix? We review the basic matrix operations. An array of numbers a a 1n A = a m1...

Matrices. Chapter What is a Matrix? We review the basic matrix operations. An array of numbers a a 1n A = a m1... Chapter Matrices We review the basic matrix operations What is a Matrix? An array of numbers a a n A = a m a mn with m rows and n columns is a m n matrix Element a ij in located in position (i, j The elements

More information

There are six more problems on the next two pages

There are six more problems on the next two pages Math 435 bg & bu: Topics in linear algebra Summer 25 Final exam Wed., 8/3/5. Justify all your work to receive full credit. Name:. Let A 3 2 5 Find a permutation matrix P, a lower triangular matrix L with

More information

Math Linear Algebra II. 1. Inner Products and Norms

Math Linear Algebra II. 1. Inner Products and Norms Math 342 - Linear Algebra II Notes 1. Inner Products and Norms One knows from a basic introduction to vectors in R n Math 254 at OSU) that the length of a vector x = x 1 x 2... x n ) T R n, denoted x,

More information

Introduction to Matrices

Introduction to Matrices POLS 704 Introduction to Matrices Introduction to Matrices. The Cast of Characters A matrix is a rectangular array (i.e., a table) of numbers. For example, 2 3 X 4 5 6 (4 3) 7 8 9 0 0 0 Thismatrix,with4rowsand3columns,isoforder

More information

MATH 4211/6211 Optimization Constrained Optimization

MATH 4211/6211 Optimization Constrained Optimization MATH 4211/6211 Optimization Constrained Optimization Xiaojing Ye Department of Mathematics & Statistics Georgia State University Xiaojing Ye, Math & Stat, Georgia State University 0 Constrained optimization

More information

Algebraic. techniques1

Algebraic. techniques1 techniques Algebraic An electrician, a bank worker, a plumber and so on all have tools of their trade. Without these tools, and a good working knowledge of how to use them, it would be impossible for them

More information

Linear Algebra: Lecture Notes. Dr Rachel Quinlan School of Mathematics, Statistics and Applied Mathematics NUI Galway

Linear Algebra: Lecture Notes. Dr Rachel Quinlan School of Mathematics, Statistics and Applied Mathematics NUI Galway Linear Algebra: Lecture Notes Dr Rachel Quinlan School of Mathematics, Statistics and Applied Mathematics NUI Galway November 6, 23 Contents Systems of Linear Equations 2 Introduction 2 2 Elementary Row

More information

A = 3 1. We conclude that the algebraic multiplicity of the eigenvalues are both one, that is,

A = 3 1. We conclude that the algebraic multiplicity of the eigenvalues are both one, that is, 65 Diagonalizable Matrices It is useful to introduce few more concepts, that are common in the literature Definition 65 The characteristic polynomial of an n n matrix A is the function p(λ) det(a λi) Example

More information

Linear Algebra. The analysis of many models in the social sciences reduces to the study of systems of equations.

Linear Algebra. The analysis of many models in the social sciences reduces to the study of systems of equations. POLI 7 - Mathematical and Statistical Foundations Prof S Saiegh Fall Lecture Notes - Class 4 October 4, Linear Algebra The analysis of many models in the social sciences reduces to the study of systems

More information

5 More on Linear Algebra

5 More on Linear Algebra 14.102, Math for Economists Fall 2004 Lecture Notes, 9/23/2004 These notes are primarily based on those written by George Marios Angeletos for the Harvard Math Camp in 1999 and 2000, and updated by Stavros

More information

Eigenvalues and Eigenvectors: An Introduction

Eigenvalues and Eigenvectors: An Introduction Eigenvalues and Eigenvectors: An Introduction The eigenvalue problem is a problem of considerable theoretical interest and wide-ranging application. For example, this problem is crucial in solving systems

More information

HW3 - Due 02/06. Each answer must be mathematically justified. Don t forget your name. 1 2, A = 2 2

HW3 - Due 02/06. Each answer must be mathematically justified. Don t forget your name. 1 2, A = 2 2 HW3 - Due 02/06 Each answer must be mathematically justified Don t forget your name Problem 1 Find a 2 2 matrix B such that B 3 = A, where A = 2 2 If A was diagonal, it would be easy: we would just take

More information

1 Functions and Graphs

1 Functions and Graphs 1 Functions and Graphs 1.1 Functions Cartesian Coordinate System A Cartesian or rectangular coordinate system is formed by the intersection of a horizontal real number line, usually called the x axis,

More information

Chapter 3. Linear and Nonlinear Systems

Chapter 3. Linear and Nonlinear Systems 59 An expert is someone who knows some of the worst mistakes that can be made in his subject, and how to avoid them Werner Heisenberg (1901-1976) Chapter 3 Linear and Nonlinear Systems In this chapter

More information

Linear and non-linear programming

Linear and non-linear programming Linear and non-linear programming Benjamin Recht March 11, 2005 The Gameplan Constrained Optimization Convexity Duality Applications/Taxonomy 1 Constrained Optimization minimize f(x) subject to g j (x)

More information

Review of Basic Concepts in Linear Algebra

Review of Basic Concepts in Linear Algebra Review of Basic Concepts in Linear Algebra Grady B Wright Department of Mathematics Boise State University September 7, 2017 Math 565 Linear Algebra Review September 7, 2017 1 / 40 Numerical Linear Algebra

More information

Convex Functions and Optimization

Convex Functions and Optimization Chapter 5 Convex Functions and Optimization 5.1 Convex Functions Our next topic is that of convex functions. Again, we will concentrate on the context of a map f : R n R although the situation can be generalized

More information

UNCONSTRAINED OPTIMIZATION PAUL SCHRIMPF OCTOBER 24, 2013

UNCONSTRAINED OPTIMIZATION PAUL SCHRIMPF OCTOBER 24, 2013 PAUL SCHRIMPF OCTOBER 24, 213 UNIVERSITY OF BRITISH COLUMBIA ECONOMICS 26 Today s lecture is about unconstrained optimization. If you re following along in the syllabus, you ll notice that we ve skipped

More information

8. Diagonalization.

8. Diagonalization. 8. Diagonalization 8.1. Matrix Representations of Linear Transformations Matrix of A Linear Operator with Respect to A Basis We know that every linear transformation T: R n R m has an associated standard

More information

Economics 205 Exercises

Economics 205 Exercises Economics 05 Eercises Prof. Watson, Fall 006 (Includes eaminations through Fall 003) Part 1: Basic Analysis 1. Using ε and δ, write in formal terms the meaning of lim a f() = c, where f : R R.. Write the

More information

Chapter 9: Systems of Equations and Inequalities

Chapter 9: Systems of Equations and Inequalities Chapter 9: Systems of Equations and Inequalities 9. Systems of Equations Solve the system of equations below. By this we mean, find pair(s) of numbers (x, y) (if possible) that satisfy both equations.

More information

Chapter Two Elements of Linear Algebra

Chapter Two Elements of Linear Algebra Chapter Two Elements of Linear Algebra Previously, in chapter one, we have considered single first order differential equations involving a single unknown function. In the next chapter we will begin to

More information

( )! ±" and g( x)! ±" ], or ( )! 0 ] as x! c, x! c, x! c, or x! ±". If f!(x) g!(x) "!,

( )! ± and g( x)! ± ], or ( )! 0 ] as x! c, x! c, x! c, or x! ±. If f!(x) g!(x) !, IV. MORE CALCULUS There are some miscellaneous calculus topics to cover today. Though limits have come up a couple of times, I assumed prior knowledge, or at least that the idea makes sense. Limits are

More information

Lecture 2 - Unconstrained Optimization Definition[Global Minimum and Maximum]Let f : S R be defined on a set S R n. Then

Lecture 2 - Unconstrained Optimization Definition[Global Minimum and Maximum]Let f : S R be defined on a set S R n. Then Lecture 2 - Unconstrained Optimization Definition[Global Minimum and Maximum]Let f : S R be defined on a set S R n. Then 1. x S is a global minimum point of f over S if f (x) f (x ) for any x S. 2. x S

More information

Written Examination

Written Examination Division of Scientific Computing Department of Information Technology Uppsala University Optimization Written Examination 202-2-20 Time: 4:00-9:00 Allowed Tools: Pocket Calculator, one A4 paper with notes

More information

Next topics: Solving systems of linear equations

Next topics: Solving systems of linear equations Next topics: Solving systems of linear equations 1 Gaussian elimination (today) 2 Gaussian elimination with partial pivoting (Week 9) 3 The method of LU-decomposition (Week 10) 4 Iterative techniques:

More information

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5 Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5 Instructor: Farid Alizadeh Scribe: Anton Riabov 10/08/2001 1 Overview We continue studying the maximum eigenvalue SDP, and generalize

More information

Practice Final Exam Solutions for Calculus II, Math 1502, December 5, 2013

Practice Final Exam Solutions for Calculus II, Math 1502, December 5, 2013 Practice Final Exam Solutions for Calculus II, Math 5, December 5, 3 Name: Section: Name of TA: This test is to be taken without calculators and notes of any sorts. The allowed time is hours and 5 minutes.

More information

Study Guide for Math 095

Study Guide for Math 095 Study Guide for Math 095 David G. Radcliffe November 7, 1994 1 The Real Number System Writing a fraction in lowest terms. 1. Find the largest number that will divide into both the numerator and the denominator.

More information

Introduction - Motivation. Many phenomena (physical, chemical, biological, etc.) are model by differential equations. f f(x + h) f(x) (x) = lim

Introduction - Motivation. Many phenomena (physical, chemical, biological, etc.) are model by differential equations. f f(x + h) f(x) (x) = lim Introduction - Motivation Many phenomena (physical, chemical, biological, etc.) are model by differential equations. Recall the definition of the derivative of f(x) f f(x + h) f(x) (x) = lim. h 0 h Its

More information

Ω R n is called the constraint set or feasible set. x 1

Ω R n is called the constraint set or feasible set. x 1 1 Chapter 5 Linear Programming (LP) General constrained optimization problem: minimize subject to f(x) x Ω Ω R n is called the constraint set or feasible set. any point x Ω is called a feasible point We

More information

NOTES ON CALCULUS OF VARIATIONS. September 13, 2012

NOTES ON CALCULUS OF VARIATIONS. September 13, 2012 NOTES ON CALCULUS OF VARIATIONS JON JOHNSEN September 13, 212 1. The basic problem In Calculus of Variations one is given a fixed C 2 -function F (t, x, u), where F is defined for t [, t 1 ] and x, u R,

More information

Copositive Plus Matrices

Copositive Plus Matrices Copositive Plus Matrices Willemieke van Vliet Master Thesis in Applied Mathematics October 2011 Copositive Plus Matrices Summary In this report we discuss the set of copositive plus matrices and their

More information

PROFIT FUNCTIONS. 1. REPRESENTATION OF TECHNOLOGY 1.1. Technology Sets. The technology set for a given production process is defined as

PROFIT FUNCTIONS. 1. REPRESENTATION OF TECHNOLOGY 1.1. Technology Sets. The technology set for a given production process is defined as PROFIT FUNCTIONS 1. REPRESENTATION OF TECHNOLOGY 1.1. Technology Sets. The technology set for a given production process is defined as T {x, y : x ɛ R n, y ɛ R m : x can produce y} where x is a vector

More information

The Singular Value Decomposition

The Singular Value Decomposition The Singular Value Decomposition Philippe B. Laval KSU Fall 2015 Philippe B. Laval (KSU) SVD Fall 2015 1 / 13 Review of Key Concepts We review some key definitions and results about matrices that will

More information

0.1 Rational Canonical Forms

0.1 Rational Canonical Forms We have already seen that it is useful and simpler to study linear systems using matrices. But matrices are themselves cumbersome, as they are stuffed with many entries, and it turns out that it s best

More information

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88 Math Camp 2010 Lecture 4: Linear Algebra Xiao Yu Wang MIT Aug 2010 Xiao Yu Wang (MIT) Math Camp 2010 08/10 1 / 88 Linear Algebra Game Plan Vector Spaces Linear Transformations and Matrices Determinant

More information

Matrices and Matrix Algebra.

Matrices and Matrix Algebra. Matrices and Matrix Algebra 3.1. Operations on Matrices Matrix Notation and Terminology Matrix: a rectangular array of numbers, called entries. A matrix with m rows and n columns m n A n n matrix : a square

More information

Review of Optimization Methods

Review of Optimization Methods Review of Optimization Methods Prof. Manuela Pedio 20550 Quantitative Methods for Finance August 2018 Outline of the Course Lectures 1 and 2 (3 hours, in class): Linear and non-linear functions on Limits,

More information

Fundamentals of Engineering Analysis (650163)

Fundamentals of Engineering Analysis (650163) Philadelphia University Faculty of Engineering Communications and Electronics Engineering Fundamentals of Engineering Analysis (6563) Part Dr. Omar R Daoud Matrices: Introduction DEFINITION A matrix is

More information

All of my class notes can be found at

All of my class notes can be found at My name is Leon Hostetler I am currently a student at Florida State University majoring in physics as well as applied and computational mathematics Feel free to download, print, and use these class notes

More information

a 11 a 12 a 11 a 12 a 13 a 21 a 22 a 23 . a 31 a 32 a 33 a 12 a 21 a 23 a 31 a = = = = 12

a 11 a 12 a 11 a 12 a 13 a 21 a 22 a 23 . a 31 a 32 a 33 a 12 a 21 a 23 a 31 a = = = = 12 24 8 Matrices Determinant of 2 2 matrix Given a 2 2 matrix [ ] a a A = 2 a 2 a 22 the real number a a 22 a 2 a 2 is determinant and denoted by det(a) = a a 2 a 2 a 22 Example 8 Find determinant of 2 2

More information

Linear Algebra March 16, 2019

Linear Algebra March 16, 2019 Linear Algebra March 16, 2019 2 Contents 0.1 Notation................................ 4 1 Systems of linear equations, and matrices 5 1.1 Systems of linear equations..................... 5 1.2 Augmented

More information

Contents. Set Theory. Functions and its Applications CHAPTER 1 CHAPTER 2. Preface... (v)

Contents. Set Theory. Functions and its Applications CHAPTER 1 CHAPTER 2. Preface... (v) (vii) Preface... (v) CHAPTER 1 Set Theory Definition of Set... 1 Roster, Tabular or Enumeration Form... 1 Set builder Form... 2 Union of Set... 5 Intersection of Sets... 9 Distributive Laws of Unions and

More information

ECON0702: Mathematical Methods in Economics

ECON0702: Mathematical Methods in Economics ECON0702: Mathematical Methods in Economics Yulei Luo SEF of HKU January 12, 2009 Luo, Y. (SEF of HKU) MME January 12, 2009 1 / 35 Course Outline Economics: The study of the choices people (consumers,

More information

Linear Algebra: Matrix Eigenvalue Problems

Linear Algebra: Matrix Eigenvalue Problems CHAPTER8 Linear Algebra: Matrix Eigenvalue Problems Chapter 8 p1 A matrix eigenvalue problem considers the vector equation (1) Ax = λx. 8.0 Linear Algebra: Matrix Eigenvalue Problems Here A is a given

More information

LINEAR SYSTEMS (11) Intensive Computation

LINEAR SYSTEMS (11) Intensive Computation LINEAR SYSTEMS () Intensive Computation 27-8 prof. Annalisa Massini Viviana Arrigoni EXACT METHODS:. GAUSSIAN ELIMINATION. 2. CHOLESKY DECOMPOSITION. ITERATIVE METHODS:. JACOBI. 2. GAUSS-SEIDEL 2 CHOLESKY

More information