January 3, 4. Math 36: Homework Solutions.

1. We say that two norms $\|\cdot\|_1$ and $\|\cdot\|_2$ on a vector space $V$ are equivalent (or comparable) if the topologies they define on $V$ are the same, i.e., for any sequence of vectors $\{x_k\}$ and any $x$ in $V$,
$$\lim_{k\to\infty} \|x_k - x\|_1 = 0 \quad \text{if and only if} \quad \lim_{k\to\infty} \|x_k - x\|_2 = 0.$$

(a) Show that $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent if and only if there exist $C_1, C_2 > 0$ such that for any $x \in V$,
$$C_1 \|x\|_1 \le \|x\|_2 \le C_2 \|x\|_1.$$

($\Rightarrow$) We are given that $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent, and we want to show that this implies there exist $C_1, C_2 > 0$ such that $C_1\|x\|_1 \le \|x\|_2 \le C_2\|x\|_1$. Note that we can switch the roles of the two norms and get $C_3\|x\|_2 \le \|x\|_1 \le C_4\|x\|_2$; hence if we prove that there exists $C_1$ with $C_1\|x\|_1 \le \|x\|_2$, the same argument with the norms interchanged produces $C_3$ with $C_3\|x\|_2 \le \|x\|_1$, which gives $\|x\|_2 \le C_2\|x\|_1$ with $C_2 = 1/C_3$. Therefore, we only need to show that one of the constants exists. We do this by contradiction.

Suppose that for every $C > 0$ there exists a nonzero $x \in V$ such that $C\|x\|_1 > \|x\|_2$. Dividing by the constant $\|x\|_1$, this says that for every $C > 0$ there exists $x$ such that
$$C > \frac{\|x\|_2}{\|x\|_1} = \left\| \frac{x}{\|x\|_1} \right\|_2.$$
We construct a sequence of vectors $\{x_k\}$ by picking the appropriate $x$ for $C = 1/k$. In other words, for arbitrary $k$, we choose $C = 1/k$ and $x_k$ such that $1/k > \|x_k\|_2$ and $\|x_k\|_1 = 1$ (we may take $x_k = x/\|x\|_1$, which is a unit vector under $\|\cdot\|_1$). Now, since $\lim_{k\to\infty} 1/k = 0$ and $0 \le \|x_k\|_2 < 1/k$ for each $k$, and the norm is always nonnegative, we deduce that $\lim_{k\to\infty} \|x_k\|_2 = 0$. As for the limit of the other norm, because $\|x_k\|_1 = 1$ for each $k$, we conclude that $\lim_{k\to\infty} \|x_k\|_1 = 1$.

Because the two norms are equivalent (apply the definition with $x = 0$), we know that $\lim_{k\to\infty} \|x_k\|_1 = 0$ if and only if $\lim_{k\to\infty} \|x_k\|_2 = 0$, for any sequence of vectors $\{x_k\}$ in $V$. But for our sequence we just saw that $\lim_{k\to\infty} \|x_k\|_2 = 0$ while $\lim_{k\to\infty} \|x_k\|_1 = 1 \ne 0$. Therefore, we have a contradiction, and we have shown that there must exist a $C_1 > 0$ such that $C_1\|x\|_1 \le \|x\|_2$; by our earlier logic this is sufficient to prove the claim.

($\Leftarrow$) We are given that there exist $C_1, C_2 > 0$ such that $C_1\|x\|_1 \le \|x\|_2 \le C_2\|x\|_1$, and we want to show that this implies that $\|\cdot\|_1$ and $\|\cdot\|_2$ are equivalent. We will prove that
$$\lim_{k\to\infty} \|x_k - x\|_2 = 0 \implies \lim_{k\to\infty} \|x_k - x\|_1 = 0$$
for any sequence of vectors $\{x_k\}$ in $V$.
Note that this will prove the entire if-and-only-if statement, because we can switch the roles of the norms to get the other direction. So, we begin by assuming that $\lim_{k\to\infty} \|x_k - x\|_2 = 0$. This means that for every $\varepsilon > 0$ there exists an $N$ such that if $k > N$, then $\|x_k - x\|_2 < \varepsilon$. We want to show that for every $\varepsilon > 0$ there exists an $N$ such that if $k > N$, then $\|x_k - x\|_1 < \varepsilon$. Next, we observe that $x_k - x$ is just some vector in $V$ for any $k$, so the hypothesis $C_1\|y\|_1 \le \|y\|_2$ (valid for every $y \in V$) applies to it:
$$C_1 \|x_k - x\|_1 \le \|x_k - x\|_2. \tag{1}$$
Now we fix $\varepsilon > 0$ and apply the assumption with $\varepsilon' = C_1\varepsilon$: we get $N$ such that for $k > N$, $\|x_k - x\|_2 < C_1\varepsilon$. Then for $k > N$,
$$C_1\|x_k - x\|_1 \le \|x_k - x\|_2 < C_1\varepsilon \implies \|x_k - x\|_1 < \frac{C_1\varepsilon}{C_1} = \varepsilon. \tag{2}$$
Therefore, we have shown that for $k$ large enough, the value $\|x_k - x\|_1$ approaches $0$. This means that $\lim_{k\to\infty}\|x_k - x\|_2 = 0 \implies \lim_{k\to\infty}\|x_k - x\|_1 = 0$, and by our logic earlier we have hence proven the if-and-only-if statement.

(b) Show that any norm $\|\cdot\|$ on $\mathbb{R}^n$ is equivalent to the Euclidean norm $\|\cdot\|_E$. Thus all norms on a finite-dimensional vector space are equivalent, which means topologically there is no difference. (Hint: for any other norm, first show $\|\cdot\| \le C\|\cdot\|_E$ for some $C > 0$, which implies the function $f(x) = \|x\|$ is continuous with respect to the topology on $\mathbb{R}^n$ induced by the Euclidean norm. Recall that the unit ball in Euclidean space is compact, and verify the inequality in the other direction.)

We begin with the standard basis $e_1, \dots, e_n$ of $\mathbb{R}^n$, and let $\|\cdot\|$ be any norm on $\mathbb{R}^n$. We can write any $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ as a linear combination of the standard basis elements:
$$x = \sum_{i=1}^n x_i e_i. \tag{1}$$
Now we try to obtain an upper bound for $\|x\|$, the norm of $x$:
$$\|x\| = \left\| \sum_{i=1}^n x_i e_i \right\| \le \sum_{i=1}^n \|x_i e_i\| = \sum_{i=1}^n |x_i|\,\|e_i\|. \tag{2}$$
Note that we used the triangle inequality of $\|\cdot\|$ in (2). We make the expression in (2) bigger using the Cauchy-Schwarz inequality:
$$\|x\| \le \sum_{i=1}^n |x_i|\,\|e_i\| \le \sqrt{\sum_{i=1}^n x_i^2}\,\sqrt{\sum_{i=1}^n \|e_i\|^2} = \|x\|_E \sqrt{\sum_{i=1}^n \|e_i\|^2}. \tag{3}$$
Notice that the term multiplying $\|x\|_E$ in (3) is a constant, so we define $C_2 = \sqrt{\sum_{i=1}^n \|e_i\|^2}$. Therefore, we have shown that $\|x\| \le C_2\|x\|_E$.

We claim that this implies that $f(x) = \|x\|$ is continuous with respect to the topology on $\mathbb{R}^n$ induced by the Euclidean norm. To show this, we need to prove that for all $\varepsilon > 0$ there exists $\delta > 0$ such that $\|x - y\|_E < \delta \implies |f(x) - f(y)| < \varepsilon$. Fix $\varepsilon$ and let $\delta = \varepsilon/C_2$. Then
$$|f(x) - f(y)| = \big|\,\|x\| - \|y\|\,\big| \qquad \text{(def. of } f\text{)} \tag{4}$$
$$\le \|x - y\| \qquad \text{(reverse triangle inequality)} \tag{5}$$
$$\le C_2 \|x - y\|_E \qquad (\|x\| \le C_2\|x\|_E) \tag{6}$$
$$< C_2 \cdot \frac{\varepsilon}{C_2} = \varepsilon \qquad \text{(def. of } \delta\text{)}. \tag{7}$$
Therefore, $f(x) = \|x\|$ is continuous with respect to the Euclidean norm on $\mathbb{R}^n$. We also know that the closed unit ball in $\mathbb{R}^n$ is compact, being closed and bounded, by the Heine-Borel theorem.
We let $B$ be the unit sphere, i.e., $B = \{x \in \mathbb{R}^n : \|x\|_E = 1\}$; as a closed subset of the closed unit ball, it is also compact.
Since $\|\cdot\|$ is continuous on $\mathbb{R}^n$ and $B$ is compact, $\|\cdot\|$ attains a minimum on $B$: there exists $p \in B$ such that $\|y\| \ge \|p\|$ for all $y \in B$. Moreover $\|p\| > 0$, since $p \in B$ forces $p \ne 0$. Now, we can scale any nonzero $x \in \mathbb{R}^n$ using the Euclidean norm so that it lies in $B$:
$$x \in \mathbb{R}^n,\ x \ne 0 \implies \frac{x}{\|x\|_E} \in B \implies \left\|\frac{x}{\|x\|_E}\right\| \ge \|p\|. \tag{8}$$
For a given $x$, $\|x\|_E$ is just a constant, so we can move it outside the norm:
$$\left\|\frac{x}{\|x\|_E}\right\| = \frac{\|x\|}{\|x\|_E} \implies \|x\| \ge \|p\|\,\|x\|_E. \tag{9}$$
Finally, taking $C_1 = \|p\|$, we have that $\|x\| \ge C_1\|x\|_E$ for any $x \in \mathbb{R}^n$ (the case $x = 0$ is trivial). Together with our earlier upper bound for $\|x\|$:
$$C_1\|x\|_E \le \|x\| \le C_2\|x\|_E. \tag{10}$$
Based on our work in part (a) of the question, this implies that $\|\cdot\|$ is equivalent to $\|\cdot\|_E$, i.e., any norm on $\mathbb{R}^n$ is equivalent to the Euclidean norm.

(c) Consider the norms $\|f\|_{L^1} = \int_0^1 |f(t)|\,dt$ and $\|f\|_C = \max_{t\in[0,1]} |f(t)|$ on the space $C^0([0,1])$ of continuous functions $f : [0,1] \to \mathbb{R}$. Show that the two norms are not equivalent. (Note that $C^0([0,1])$ is a vector space of infinite dimension.)

Consider the sequence of functions $\{f_k\}$ defined by $f_k(t) = t^k$, and the following two limits:
$$\lim_{k\to\infty} \|f_k\|_{L^1} = \lim_{k\to\infty} \int_0^1 t^k\,dt = \lim_{k\to\infty} \frac{1}{k+1} = 0, \tag{1}$$
$$\lim_{k\to\infty} \|f_k\|_C = \lim_{k\to\infty} \max_{t\in[0,1]} t^k = 1. \tag{2}$$
These limits imply that for $x = 0$ (the zero function), $\lim_{k\to\infty}\|f_k - x\|_{L^1} = 0$ while $\lim_{k\to\infty}\|f_k - x\|_C = 1 \ne 0$, which means by the definition that the two norms are not equivalent.

2. Solve Pugh's Chapter 5, Problems #9, 15, 16, 18.

9) Give an example of two matrices such that the norm of the product is less than the product of the norms.

Consider the following two matrices:
$$A = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}. \tag{1}$$
First, let us calculate $\|A\|$. Let $x \in \mathbb{R}^2$ with $|x| \le 1$. Then
$$\|A\| = \sup_{|x| \le 1} |Ax| = \sup \left| \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \right| = \sup |(x_1, 0)| = \sup |x_1| = 1. \tag{2}$$
Note that if $x = (1, 0)$, then $|Ax| = 1$; and since we are restricted to vectors for which $|x| \le 1$, we have $|x_1| \le 1$, so the supremum is exactly $1$.
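As a quick numerical sanity check on the operator norm just computed (my own addition, not part of the original solution; it assumes NumPy is available), we can sample unit vectors on the circle and compare against NumPy's built-in operator 2-norm:

```python
import numpy as np

# The matrix A from the example above.
A = np.array([[1.0, 0.0],
              [0.0, 0.0]])

# Approximate sup_{|x| = 1} |Ax| by sampling unit vectors on the circle.
thetas = np.linspace(0.0, 2.0 * np.pi, 1000)
units = np.stack([np.cos(thetas), np.sin(thetas)])  # shape (2, 1000)
sampled_sup = np.max(np.linalg.norm(A @ units, axis=0))

print(sampled_sup)                # 1.0, attained at x = (1, 0)
print(np.linalg.norm(A, ord=2))   # NumPy's exact operator 2-norm agrees: 1.0
```

The sampled supremum is attained at $\theta = 0$, i.e., $x = (1, 0)$, matching the hand computation.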
Now, let us calculate $\|B\|$. Let $x \in \mathbb{R}^2$ with $|x| \le 1$. Then
$$\|B\| = \sup_{|x| \le 1} |Bx| = \sup \left| \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \right| = \sup |(0, x_2)| = \sup |x_2| = 1. \tag{3}$$
Note that if $x = (0, 1)$, then $|Bx| = 1$, and we are restricted to vectors for which $|x| \le 1$. Now that we have $\|A\|$ and $\|B\|$, we can calculate the product of the norms:
$$\|A\|\,\|B\| = 1 \cdot 1 = 1. \tag{4}$$
Next, we find the norm of the product. First we must find what $AB$ is:
$$AB = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}. \tag{5}$$
Now, we finally calculate $\|AB\|$. Let $x \in \mathbb{R}^2$ with $|x| \le 1$:
$$\|AB\| = \sup_{|x| \le 1} |(AB)x| = \sup |(0, 0)| = 0. \tag{6}$$
According to (4) and (6), we have found two matrices $A$ and $B$ such that $\|AB\| < \|A\|\,\|B\|$.

15) Show that the partial derivatives of the function
$$f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0), \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$$
exist at the origin, but the function is not differentiable there.

First we show that the partial derivatives exist at $(0, 0)$:
$$\frac{\partial f}{\partial x}(0,0) = \lim_{h\to 0} \frac{f(h, 0) - f(0, 0)}{h} = \lim_{h\to 0} \frac{1}{h} \cdot \frac{h \cdot 0}{h^2 + 0^2} = 0, \tag{1}$$
$$\frac{\partial f}{\partial y}(0,0) = \lim_{h\to 0} \frac{f(0, h) - f(0, 0)}{h} = \lim_{h\to 0} \frac{1}{h} \cdot \frac{0 \cdot h}{0^2 + h^2} = 0. \tag{2}$$
That is, the partial derivatives exist at the origin and both are equal to $0$ there. Now we must show that, despite this, $f$ is not differentiable at the origin. Assume that $f$ is differentiable, and let $h = (h_1, h_2)$. Then for some linear transformation $A$ of $\mathbb{R}^2$ into $\mathbb{R}$, since $f$ is differentiable:
$$\lim_{h\to 0} \frac{|f(\bar{x} + h) - f(\bar{x}) - Ah|}{|h|} = 0. \tag{3}$$
Since we assumed $f$ is differentiable at $(0, 0)$, $A$ must be equal to the matrix determined by the partial derivatives in (1) and (2):
$$A_{(0,0)} = \begin{pmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \end{pmatrix}_{(0,0)} = \begin{pmatrix} 0 & 0 \end{pmatrix}. \tag{4}$$
Acting with $A_{(0,0)}$ on $h$, we get:
$$A_{(0,0)} h = \begin{pmatrix} 0 & 0 \end{pmatrix} \begin{pmatrix} h_1 \\ h_2 \end{pmatrix} = 0. \tag{5}$$
Plugging $\bar{x} = (0, 0)$ and (5) into (3):
$$\lim_{h\to 0} \frac{|f(0 + h_1, 0 + h_2) - f(0, 0)|}{\sqrt{h_1^2 + h_2^2}} = \lim_{h\to 0} \frac{h_1 h_2}{(h_1^2 + h_2^2)\sqrt{h_1^2 + h_2^2}} = \lim_{h\to 0} \frac{h_1 h_2}{(h_1^2 + h_2^2)^{3/2}}. \tag{6}$$
According to (3), the limit in (6) has to equal $0$ from any direction. We choose $h = (t, t)$ (approaching along the line $y = x$) and get:
$$\lim_{t\to 0} \frac{h_1 h_2}{(h_1^2 + h_2^2)^{3/2}} = \lim_{t\to 0} \frac{t^2}{(2t^2)^{3/2}} = \lim_{t\to 0} \frac{t^2}{2^{3/2}\,t^3} = \lim_{t\to 0} \frac{1}{2^{3/2}\,t} \tag{7}$$
(taking $t > 0$), which blows up instead of vanishing. Therefore, our assumption that the limit in (3) exists and equals zero was wrong, which means that despite the fact that the partial derivatives of $f$ exist at the origin, $f$ is not differentiable there.

16) Let $f : \mathbb{R}^2 \to \mathbb{R}^3$ and $g : \mathbb{R}^3 \to \mathbb{R}$ be defined by $f = (x, y, z)$ and $g = w$, where
$$w = w(x, y, z) = xy + yz + zx, \qquad x = x(s, t) = st, \qquad y = y(s, t) = s\cos t, \qquad z = z(s, t) = s\sin t.$$

(a) Find the matrices that represent the linear transformations $(Df)_p$ and $(Dg)_q$, where $p = (s_0, t_0) = (0, 1)$ and $q = f(p)$.

First, note that $f(s, t) = (x, y, z) = (st, s\cos t, s\sin t)$. Now, let us find the partial derivatives $D_j f_i$ of $f$:
$$D_1 f_1 = \frac{\partial x}{\partial s} = t, \qquad D_2 f_1 = \frac{\partial x}{\partial t} = s, \tag{1}$$
$$D_1 f_2 = \frac{\partial y}{\partial s} = \cos t, \qquad D_2 f_2 = \frac{\partial y}{\partial t} = -s\sin t, \tag{2}$$
$$D_1 f_3 = \frac{\partial z}{\partial s} = \sin t, \qquad D_2 f_3 = \frac{\partial z}{\partial t} = s\cos t. \tag{3}$$
According to Theorem 9.17 in Rudin, we construct $Df$ and then evaluate at $(0, 1)$:
$$(Df)_{(0,1)} = \begin{pmatrix} D_1 f_1 & D_2 f_1 \\ D_1 f_2 & D_2 f_2 \\ D_1 f_3 & D_2 f_3 \end{pmatrix} = \begin{pmatrix} t & s \\ \cos t & -s\sin t \\ \sin t & s\cos t \end{pmatrix}_{(0,1)} = \begin{pmatrix} 1 & 0 \\ \cos(1) & 0 \\ \sin(1) & 0 \end{pmatrix}. \tag{4}$$
Next, we note that $g(x, y, z) = w = xy + yz + zx$, and we proceed similarly to find the partial derivatives $D_j g$ of $g$:
$$D_1 g = \frac{\partial g}{\partial x} = y + z, \tag{5}$$
$$D_2 g = \frac{\partial g}{\partial y} = x + z, \tag{6}$$
$$D_3 g = \frac{\partial g}{\partial z} = y + x. \tag{7}$$
As before, we construct $Dg$ and evaluate at $q = f(p) = f(0, 1) = (0 \cdot 1,\ 0 \cdot \cos(1),\ 0 \cdot \sin(1)) = (0, 0, 0)$:
$$(Dg)_{(0,0,0)} = \begin{pmatrix} \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} & \frac{\partial g}{\partial z} \end{pmatrix} = \begin{pmatrix} y + z & x + z & y + x \end{pmatrix}_{(0,0,0)} = \begin{pmatrix} 0 & 0 & 0 \end{pmatrix}. \tag{8}$$
Therefore, we have constructed the two matrices $(Df)_{(0,1)}$ and $(Dg)_{(0,0,0)}$.
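The two Jacobian matrices above can be double-checked symbolically. The following sketch is my own verification (not part of the original solution) and assumes SymPy is available; the variable names mirror the problem:

```python
import sympy as sp

s, t, x, y, z = sp.symbols('s t x y z')

f = sp.Matrix([s * t, s * sp.cos(t), s * sp.sin(t)])  # f(s, t) = (x, y, z)
g = x * y + y * z + z * x                              # g(x, y, z) = w

Df = f.jacobian(sp.Matrix([s, t]))                     # 3x2 matrix of partials
Dg = sp.Matrix([[sp.diff(g, v) for v in (x, y, z)]])   # 1x3 matrix of partials

p = {s: 0, t: 1}
q = dict(zip((x, y, z), f.subs(p)))                    # q = f(p) = (0, 0, 0)

print(Df.subs(p))  # entries: [[1, 0], [cos(1), 0], [sin(1), 0]]
print(Dg.subs(q))  # entries: [[0, 0, 0]]
```

Both evaluated matrices agree with the hand computation in (4) and (8).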
(b) Use the Chain Rule to calculate the matrix $[\partial w/\partial s,\ \partial w/\partial t]$ that represents $(D(g \circ f))_p$.

According to the Chain Rule:
$$\frac{\partial w}{\partial s} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial s} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial s}, \qquad \frac{\partial w}{\partial t} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial t} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial t}. \tag{9}$$
This is equivalent to multiplying the two matrices we found in part (a):
$$(D(g \circ f))_{(0,1)} = \begin{pmatrix} \frac{\partial g}{\partial x} & \frac{\partial g}{\partial y} & \frac{\partial g}{\partial z} \end{pmatrix}_{(0,0,0)} \begin{pmatrix} D_1 f_1 & D_2 f_1 \\ D_1 f_2 & D_2 f_2 \\ D_1 f_3 & D_2 f_3 \end{pmatrix}_{(0,1)} = \begin{pmatrix} 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ \cos(1) & 0 \\ \sin(1) & 0 \end{pmatrix} \tag{10}$$
$$\implies (D(g \circ f))_{(0,1)} = \begin{pmatrix} \frac{\partial w}{\partial s} & \frac{\partial w}{\partial t} \end{pmatrix}_{(0,1)} = \begin{pmatrix} 0 & 0 \end{pmatrix}. \tag{11}$$
In other words, both partial derivatives are equal to $0$.

(c) Plug the functions $x = x(s, t)$, $y = y(s, t)$, and $z = z(s, t)$ directly into $w = w(x, y, z)$, and recalculate $[\partial w/\partial s,\ \partial w/\partial t]$, verifying the answer given in (b).

First we find $w$ in terms of $s$ and $t$:
$$w = xy + yz + zx = (st)(s\cos t) + (s\cos t)(s\sin t) + (s\sin t)(st) \tag{12}$$
$$= s^2 t\cos t + s^2\sin t\cos t + s^2 t\sin t. \tag{13}$$
Now we find $\partial w/\partial s$ and $\partial w/\partial t$ directly and evaluate at $(0, 1)$:
$$\frac{\partial w}{\partial s}\bigg|_{(0,1)} = \left(2st\cos t + 2s\sin t\cos t + 2st\sin t\right)\Big|_{(0,1)} = 0, \tag{14}$$
$$\frac{\partial w}{\partial t} = \left(s^2\cos t - s^2 t\sin t\right) + \left(s^2\cos^2 t - s^2\sin^2 t\right) + \left(s^2\sin t + s^2 t\cos t\right) \tag{15}$$
$$\implies \frac{\partial w}{\partial t}\bigg|_{(0,1)} = s^2\left(\cos t - t\sin t + \cos^2 t - \sin^2 t + \sin t + t\cos t\right)\Big|_{(0,1)} = 0. \tag{16}$$
Therefore, we have confirmed our answers from (b) by finding the partials directly.

(d) Examine the statements of the multivariable Chain Rules that appear in your old calculus book and observe that they are nothing more than the components of various product matrices.

Observed.

18) The directional derivative of $f : U \to \mathbb{R}^m$ at $p \in U$ in the direction $u$ is the limit, if it exists,
$$\nabla_p f(u) = \lim_{t\to 0} \frac{f(p + tu) - f(p)}{t}.$$
(Usually, one requires that $|u| = 1$.)

(a) If $f$ is differentiable at $p$, why is it obvious that the directional derivative exists in each direction $u$?

If $f$ is differentiable at $p$, we have by definition:
$$\lim_{h\to 0} \frac{|f(p + h) - f(p) - Ah|}{|h|} = 0. \tag{1}$$
We define $R(h) = f(p + h) - f(p) - Ah$, so that (as in Rudin) $\lim_{h\to 0} R(h)/|h| = 0$, and then we can write:
$$f(p + h) = f(p) + A(h) + R(h). \tag{2}$$
Now, we let $h = tu$ and rewrite the directional derivative:
$$\nabla_p f(u) = \lim_{t\to 0} \frac{f(p + tu) - f(p)}{t} = \lim_{t\to 0} \frac{f(p) + A(tu) + R(tu) - f(p)}{t} = \lim_{t\to 0} \frac{A(tu) + R(tu)}{t}. \tag{3}$$
Now, since $R(h)/|h| \to 0$ as $h \to 0$, we have that $R(tu)/|tu| \to 0$ as $tu \to 0$. Since $tu \to 0$ as $t \to 0$, we can write:
$$\lim_{t\to 0} \frac{R(tu)}{t} = \lim_{t\to 0} \frac{R(tu)}{|tu|} \cdot \frac{|tu|}{t} = \lim_{t\to 0} \left(\pm|u|\right) \frac{R(tu)}{|tu|} = 0. \tag{4}$$
Therefore, according to (3) and (4), we have:
$$\nabla_p f(u) = \lim_{t\to 0} \frac{A(tu) + R(tu)}{t} = \lim_{t\to 0} \frac{A(tu)}{t} = \lim_{t\to 0} \frac{t\,A(u)}{t} = A(u). \tag{5}$$
That is, the directional derivative both exists and equals $A(u)$.

(b) Show that the function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by
$$f(x, y) = \begin{cases} \dfrac{x^3 y}{x^4 + y^2} & \text{if } (x, y) \ne (0, 0), \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$$
has $\nabla_{(0,0)} f(u) = 0$ for all $u$ but is not differentiable at $(0, 0)$.

We begin by showing that the directional derivative of $f$ at the point $p = (0, 0)$ is $0$ for any $u = (u_1, u_2)$:
$$\nabla_{(0,0)} f(u) = \lim_{t\to 0} \frac{f((0,0) + t(u_1, u_2)) - f(0, 0)}{t} = \lim_{t\to 0} \frac{1}{t} \cdot \frac{(tu_1)^3 (tu_2)}{(tu_1)^4 + (tu_2)^2} \tag{1}$$
$$= \lim_{t\to 0} \frac{t^4 u_1^3 u_2}{t^3 \left(t^2 u_1^4 + u_2^2\right)} = \lim_{t\to 0} \frac{t\,u_1^3 u_2}{t^2 u_1^4 + u_2^2} \tag{2}$$
$$= 0, \tag{3}$$
since if $u_2 \ne 0$ the limit is $0 \cdot u_1^3 u_2 / u_2^2 = 0$, while if $u_2 = 0$ the numerator vanishes identically. That is, the directional derivative is $0$ for any $u$. Now we show that this does not indicate that $f$ is differentiable at $(0, 0)$. First we show that the partial derivatives exist at $(0, 0)$:
$$\frac{\partial f}{\partial x}(0,0) = \lim_{h\to 0} \frac{f(h, 0) - f(0, 0)}{h} = \lim_{h\to 0} \frac{1}{h} \cdot \frac{h^3 \cdot 0}{h^4 + 0^2} = 0, \tag{4}$$
$$\frac{\partial f}{\partial y}(0,0) = \lim_{h\to 0} \frac{f(0, h) - f(0, 0)}{h} = \lim_{h\to 0} \frac{1}{h} \cdot \frac{0 \cdot h}{0^4 + h^2} = 0. \tag{5}$$
That is, the partial derivatives exist at the origin and both are equal to $0$ there. Now, assume that $f$ is differentiable and let $h = (h_1, h_2)$. For some linear transformation $A$ of $\mathbb{R}^2$ into $\mathbb{R}$, since $f$ is differentiable:
$$\lim_{h\to 0} \frac{|f(\bar{x} + h) - f(\bar{x}) - Ah|}{|h|} = 0. \tag{6}$$
We construct $A_{(0,0)}$ using the partial derivatives from (4) and (5):
$$A_{(0,0)} = \begin{pmatrix} \frac{\partial f}{\partial x} & \frac{\partial f}{\partial y} \end{pmatrix}_{(0,0)} = \begin{pmatrix} 0 & 0 \end{pmatrix}. \tag{7}$$
Therefore, we see that when we apply $A_{(0,0)}$ to any vector $h$, we get $A_{(0,0)} h = 0$. Plugging this result and $\bar{x} = (0, 0)$ into (6), we get:
$$\lim_{h\to 0} \frac{|f(0 + h_1, 0 + h_2) - f(0, 0)|}{\sqrt{h_1^2 + h_2^2}} = \lim_{h\to 0} \frac{h_1^3 h_2}{\left(h_1^4 + h_2^2\right)\sqrt{h_1^2 + h_2^2}}. \tag{8}$$
According to (6), the limit in (8) has to equal $0$ from any direction. We choose $h = (t, t^2)$ (approaching along the parabola $y = x^2$) and get:
$$\lim_{t\to 0} \frac{h_1^3 h_2}{\left(h_1^4 + h_2^2\right)\sqrt{h_1^2 + h_2^2}} = \lim_{t\to 0} \frac{t^3 \cdot t^2}{\left(t^4 + (t^2)^2\right)\sqrt{t^2 + (t^2)^2}} = \lim_{t\to 0} \frac{t^5}{2t^4 \cdot t\sqrt{1 + t^2}} \tag{9}$$
$$= \lim_{t\to 0} \frac{t^5}{2t^5\sqrt{1 + t^2}} = \lim_{t\to 0} \frac{1}{2\sqrt{1 + t^2}} = \frac{1}{2} \ne 0 \tag{10}$$
(taking $t > 0$; the absolute value in (8) makes the sign of $t$ irrelevant). Therefore, our assumption that the limit in (6) exists and equals zero was wrong, which means that despite the fact that the directional derivatives of $f$ exist at the origin, $f$ is not differentiable there.
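The two claims in 18(b) can also be observed numerically. The sketch below is my own check (not part of the original solution) and assumes NumPy: the directional difference quotients $f(tu)/t$ are small in every sampled direction for small $t$, while the differentiability quotient along the parabola $y = x^2$ stays near $1/2$:

```python
import numpy as np

def f(x, y):
    # f(x, y) = x^3 y / (x^4 + y^2), extended by 0 at the origin
    if x == 0.0 and y == 0.0:
        return 0.0
    return x**3 * y / (x**4 + y**2)

t = 1e-6

# Directional difference quotients [f(tu) - f(0,0)] / t: all near 0.
quotients = [f(t * np.cos(th), t * np.sin(th)) / t
             for th in np.linspace(0.0, 2.0 * np.pi, 16)]
print(max(abs(q) for q in quotients))   # tiny, on the order of t

# Differentiability quotient |f(h) - f(0,0) - A h| / |h| with A = [0, 0],
# evaluated along the parabola h = (t, t^2): stays near 1/2, not 0.
h1, h2 = t, t**2
print(f(h1, h2) / np.hypot(h1, h2))     # approximately 0.5
```

Note that the smallness of the directional quotients depends on the direction being fixed as $t \to 0$; along the $t$-dependent path $(t, t^2)$ the quotient does not vanish, which is exactly the failure of differentiability.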