1 Euclidean space R n

Size: px

Start display at page:

Download "1 Euclidean space R n"

Erick Stanley
6 years ago
Views:

1 1 Euclidean space R n We start the course by recalling prerequisites from the courses Hedva 1 and 2 and Linear Algebra 1 and Scalar product and Euclidean norm During the whole course, the n-dimensional linear space over the reals will be our home. It is denoted by R n. We say that R n is an Euclidean space if it is equipped with a scalar product, that is a function (x, y): R n R n R satisfying conditions (i) (x, x) ; (ii) (x, x) = if and only if x = ; (iii) (x, y) = (y, x); (iv) for any t R, (tx, y) = t(x, y); (v) (x + y, z) = (x, z) + (y, z). If {e 1,..., e n } is a basis in R n, and x = x i e i, y = y j e j, then (x, y) = i,j x iy j (e i, e j ). In particular, if {e 1,..., e n } is an orthonormal basis in R n, then (x, y) = x i y i. Having a chosen scalar product 1, we can define the length (or the Euclidean norm) of the vector x = (x, x), and the angle between two vectors: (x, y) < (x, y) = arccos x y. Of course, the angle < (x, y) is defined only for x, y. The norm enjoys the following properties: (a) x > for x ; (b) tx = t x for t R; (c) x + y x + y. Exercise 1.1. Prove the property (c). attains in (c). Describe when the equality sign There are many other functions. : R n R satisfying conditions (a), (b) and (c). They are also called the norms. For example, the l p -norm { n } 1/p def x p = x i p, 1 p <, i=1 1 Of course, there are many different scalar products in R n. If we fix one of them, scalar then any other has the form (x, y) = (Ax, y), where A is a positive linear operator in R n. 1

2 and x def = max 1 i n x i, here {x i } are the coordinates of x in a fixed basis in R n. It is easy to see that the l p -norm meets conditions (a) and (b). In the extreme cases p = 1 and p =, condition (c) is also obvious. Now, we sketch the proof of (c) for 1 < p <, splitting it into three simple exercises. Exercise 1.2. Let q be a dual exponent to p: 1 p + 1 q = 1, 1. Prove Young s inequality ab ap p + bq q, for a, b >. The equality sign attains for a p = b q only. Hint: maximize the function h(a) = ab ap. p 2. Prove Hölder s inequality ( ) 1/p ( ) 1/q a i b i a p i b q i, i i for a i, b i >. The equality sign attains only for a i /b i = const, 1 i n. Hint: assume, WLOG, that a p i = b q i = 1, i i and apply Young s inequality to a i b i. 3. Check that ( ) 1/p ( ) 1/p ( ) 1/p (a i + b i ) p a p i + b p i, Hint: write (a i + b i ) p = i i and apply Hölder s inequality. i i a i (a i + b i ) p 1 + i i i b i (a i + b i ) p 1, Having the norm., we define the distance ρ(x, y) = x y. In this course, we will mainly use the Euclidean norm 2, and the Euclidean distance. Normally, we denote the Euclidean norm by x. 2 Though, as we will see in the next lecture, all the norms in R n are equivalent and the choice of the norm does no affect the problems we are studying. 2

3 1.2 Open and closed subsets of R n Recall a familiar terminology: open ball B(a, r) = {x R n : x a < r}; closed ball B(a, r) = {x R n : x a r}; sphere S(a, r) = {x R n : x a = r} = B(a, r); brick {x R n : α i < x i < β i }. This brick is open. Sometimes, we ll need closed and semi-open bricks. complement (to the set E) E c = R n \ E. The next definition is fundamental: Definition 1.3. A set A R n is open, if for each a X there is an open ball B centered at a such that B X. A set X R n is closed if its complement is open. The empty set and the whole R n are open and closed at the same time. (These two subsets of R n are often called trivial subsets). Exercise 1.4. Union of any family of open sets is open, intersection of any family of closed sets is closed. Finite intersection of open sets is open, finite union of closed subsets is closed. Exercise 1.5. Give an example of infinite family of open sets with nontrivial and closed intersection, and an example of infinite family of closed sets with non-trivial and open union. Let us proceed further. Any open set O containing the point a is called a vicinity/neighbourhood of a, the set O \ {a} is called a punctured neighbourhood of a. The (open) ball B(a, δ) is called a δ-neighbourhood of a. Interiour inte is a sent of points a which belong to E with some neighbourhood. Exteriour exte is a set of points a having a neighbourhood which does not belong to E (equivalently, a belongs to the interiour of the complement E c ). Boundary E is a set of points which are neither interiour nor exteriour points of E. In other words, we always have a decomposition R n = inte E exte. Closure: E = E E. Equivalently, E is a union of E with the set of all accumulation points of E. Recall that x is an accumulation point of the set E if in any punctured neighbourhood of x there is at least one (and therefore, infinitely many) points of E. 3

4 Exercise The set E is closed if and only if it contains all its accumulation points; i.e. E = E. 2. The closure E is always closed, and the second closure coincides with the first one E = E. 3. Only trivial subsets are simultaneously closed and open. Convergence: a sequence {x k } R n converges to x if lim k x k x =. Equivalently 3, all coordinates must converge: x k i x i for 1 i n. The Cauchy criterion and the Bolzano-Weierstrass lemma are the same as in the 1D-case. Cauchy s criterion says that the sequence {x k } converges to x iff 4 for an arbitrary small ɛ > there exists a sufficiently large N such that for all k, m N, x k x m < ɛ. The BW-lemma says that any bounded sequence in R n has a convergent subsequence. Exercise 1.7. Prove Cauchy s criterion and the Bolzano-Weierstrass lemma. 1.3 Compact sets Definition 1.8. A set K R n is compact, if any sequence {x m } K has a subsequence {x m j } convergent to a point from K. Claim 1.9. A set K is compact if and only if it is closed and bounded. Proof: (a) Assume that K is compact. Then K must contain all its accumulation points and therefore K is closed. Assume that K is unbounded. Then there is a sequence {x m } K such that x m m. If a subsequence {x m j } converges to a point x, then by the triangle inequality x x m j x m j x m j ɛ +. Contradiction. (b) Assume that the set K is closed and bounded. By the BW-lemma, each sequence {x m } K has a convergent subsequence. Let x be its limit. Then x is an accumulation point of K, and since K is closed, x K. Thus, K is compact. 3 by the elementary inequality 4 iff = if and only if max x i y i x y n max x i y i. 1 i n 1 i n 4

5 Exercise 1.1. Any nested sequence of compact sets K 1 K 2... K j... has a non-empty intersection. Hint: consider a sequence {x j } such that x j K j. If the diameters of K j converge to zero, then the intersection is a singleton. Here, diameter(k) = max x y. x,y K Continue with definitions. Open covering of X: X j J U j, where the sets U j are open. If the set of indices J is finite, then the covering is called finite. The next lemma is fundamental: Lemma 1.11 (Heine - Borel). The set K is compact if and only if open covering of K a finite subcovering In topology, the boxed formula is taken as the definition of compact sets. I ll prove this result only in one direction, assuming that K is a compact set. Proof: First, we enclose the compact K by a brick I, and then will follow the standard dissection procedure, as in the one-dimensional case. The other direction probably will not be used later, so I leave it as and exercise. Exercise If for any open covering of the set K R n there exists a finite subcovering, then K is a compact set. Exercise Show that the result fails for bounded (but non-closed) sets, and for closed (but unbounded) sets. It also fails for coverings of compact set by non-open sets. 5

6 2 Continuous mappings. Curves in R n We continue with prerequisites. 2.1 Continuous mappings Let X R n, and x be an accumulation point of X. Let f : X R m. If there is a R m such that f(x) a when x x, x X, then we say that f has a limit a when x x along X, and write lim f(x) = a. x x, x X Usually, we assume that x intx, then there is no need to indicate that x x along X. It is important to keep in mind that existence of such a limit, generally speaking, does not imply existence of iterated limits and vice versa (see the exercise in the very end of this subsection). The mapping f : X R m is continuous at x, if f(x) f(x ) for x x along X. It is always possible to check continuity using the coordinate functions. Fix a basis {e 1,... e m } in R m, and consider the coordinate functions f j (x), 1 j m (that is, f(x) = j f j(x)e j ). It is easy to see that the function f is continuous at x iff all the functions f j are continuous at x. Exercise 2.1. Write down a formal proof. We say that f is continuous on X if it is continuous at every point of X. By C(X) we denote the class of all continuous functions on X. Exercise 2.2. Prove or disprove: let f : R 2 R 1 be a mapping with the following properties: for each y R, the function x f(x, y) is continuous on R, and for each x R, the function y f(x, y) is continuous on R. Then f is continuous on R 2. If X is a compact set, then continuous mappings defined on X enjoy many properties of continuous functions defined on closed segments. Exercise 2.3. Let f : K R m be a continuous mapping on a compact set K R n. Prove: (i) f is uniformly continuous, that is ɛ > δ > such that x, y K, x y < δ = f(x) f(y) < ɛ. 6

7 (ii) f is bounded on K, i.e. M such that f(x) M for all x K; (iii) if m = 1 (that is, f is a scalar function), f attains its maximal and minimal values on K. Exercise 2.4. A subset K R n is compact if and only if any continuous function map f : K R is bounded on K. Exercise 2.5. If f is a continuous mapping, and K is a compact set, then its image fk is compact as well. If the set V is open, then its preimage f 1 V is also open. Exercise 2.6. Check: (i) for the function f(x, y) = { xy, x 2 +y 2 (x, y) (, ),, (x, y) = (, ), the iterated limits exist and equal to each other but the limit does not exist; (ii) for the function the limits lim lim x y f(x, y) = f(x, y) = lim lim f(x, y) = y x lim f(x, y) (x,y) (,) { x + y sin 1, x (x, y) (, ),, (x, y) = (, ), lim f(x, y) and lim (x,y) (,) lim x y exist and equal zero, but the second iterated limit does not exist; (iii) the function f(x, y) = lim lim f(x, y) y x { x 2 y, x 4 +y 2 (x, y) (, ),, (x, y) = (, ), 7 f(x, y)

8 has the following property: for each a and b in R 2 lim f(ta, tb) =, t but the limit does not exist. lim f(x, y) (x,y) (,) Exercise 2.7. Let E R n be a closed set, f : E R m a continuous function. Show that its graph is a closed subset of R n+m. Γ f def = {(x, f(x)): x E} Exercise 2.8. Show that there is a mapping f from the unit ball B R n onto the whole R n such that f and f 1 are continuous and f is one-to-one. (Such maps are called homeomorphisms) Linear mappings We denote by L(R n, R m ) the space of all linear mappings ( = transformations = operators) from R n to R m. Since we can add mappings: (A + B)x def = Ax + Bx, this is a linear space. This space can be identified with R mn = R } m {{... R m }. For identification, we use the matrix representation which we n times recall. Let A L(R n, R m ). Fix bases: {e j } R n, and {e k } Rm. Then Ae j = m a k j e k. k=1 The matrix of A consists of n columns of height m, the j-th column consists of the coordinates of the vector Ae j in the basis {e k }: a 11 a a 1n a 21 a a 2n Mat(A) = a m1 a m a mn Exercise 2.9. What happens with the matrix of A under the change of the bases in R n and R m? 8

9 By Mat R (m n) we denote the linear space of m n matrices with real entries. Thus we get three isomorphic linear spaces: L(R n, R m ) Mat R (m n) R mn. We can also multiply elements from L(R n, R m ) Mat R (m n) by elements from L(R m, R p ) Mat R (p m) taking the composition of linear mappings, or, what is the same, the product of the correspondent matrices. If m = 1, then the linear mapping L: R n R 1 is called a linear functional on R n. The space of linear functionals is an n-dimensional space called the dual space. Usually, it is denoted by (R n ). The important representation theorem from the Linear Algebra says that if R n has an Euclidean structure, then for any linear functions L (R n ) there exists a vector l R n such that Lx = (l, x) for any x R n. Exercise 2.1. Every linear map L: R n R m is continuous. Hint: first, prove this for linear functionals (i.e., scalar linear functions) on R n : if x = x j e j, then L(x) = L(e j )x j, and L(x) L(y) = L(e j )(x j y j ). The rest is clear. Since the unit ball is a compact subset of R n, as a corollary, we obtain that, for any linear mapping L, the quantity N(L) def = sup Lx x 1 is finite, and actually the maximum on the RHS is attained somewhere on the unit sphere. Actually, N(L) can be defined as the best possible constant in the estimate Lx N(L) x ; i.e. N(L) = Lx sup. x R n \{} x This quantity is called the operator norm of the mapping L, and is denoted by L. Exercise Check that this is a norm, i.e., L = iff L =, and L 1 + L 2 L 1 + L 2. Check that L M L M. It is easy to show that (2.12) L lj,k 2, where (l j,k ) are matrix elements of L. 9 j,k

10 Exercise Prove (2.12). Hint: set y = Lx, using the Cauchy-Schwarz inequality, estimate first yk 2 and then m k=1 y2 k. Here y k = j l k jx j. If m > 1, estimate (2.12) is not sharp. Later, using Lagrange multipliers, we ll give a sharp expression for L. (Check, maybe, you already know it from the Linear Algebra course?) If m = 1, that is L (R n ) is a linear function, there is a vector b such that Lx = b j x j for all x (simply b j = L(e j )). In this case, L = b. (Check this!) The expression j,k l2 j,k we met above is called the Hilbert-Schmidt norm of the operator L and denoted by L HS. There is a more natural definition of the Hilbert-Schmidt norm: Exercise Show that L HS = trace(l L), and that. HS does not depend on the choice of the bases in R n and R m. Exercise Show that L HS n L. Hint: Le j L for each j, 1 j n Continuity of norms Another important class of continuous functions is given by norms, i.e. functions. : R n R + satisfying conditions (a) (c) from Lecture 1. It will be convenient for us to prove simultaneously the next two claims. Claim Any norm. in the Euclidean space R n is equivalent to the Euclidean one: there are positive constants c and C depending on the norm such that for any x R n c x x C x. Claim Any norm is a continuous function on R n. Proof: First, we check the second inequality in Claim Let {e i } be the standard orthonormal basis in R n. Then writing x = x i e i, we get x n x i e i i=1 i x 2 i } {{ } = x e i 2 = C x. i } {{ } =C 1

11 Now, we check continuity of the norm. : x y = y + (x y) y x y C x y, and by symmetry y x C x y. To get the first inequality in Claim 2.16, observe that the function x x attains its minimal value on the unit (Euclidean) sphere, and this value must be positive (why?). Let us denote it by c. Then, writing x = x x, x = 1, we get x = x x c x, completing the proofs Norms and symmetric compact convex bodies There is an intimate relation between the norms in R n and a special class of compact convex bodies in R n. The closed unit ball with respect to any norm in R n is defined as (2.18) K = {x: x 1} A straightforward inspection shows that K always has the following four properties: (i) K is compact; (ii) K is convex, that is, if a, b K, then the whole segment with the end-points at a and b, {ta + (1 t)b: t 1}, belongs to K; (iii) K is symmetric about the origin, i.e. if x K, then x K as well; (iv) K contains a Euclidean neighbourhood of the origin. Exercise Check the properties (i) (iv). Draw the unit ball for the l p norms in R 2, for 1 p. If the norm is generated by a scalar product, how the body K looks like? Assume that we know the compact convex body K. How to recover the norm.? Given x, observe that tx K for t 1, and tx / K for x t > 1. Thus x x 1 = max{t: tx K}. Problem 2.2. Let K be a set with properties (i) (iv). For every x, let x def = Then. is a norm, and (2.18) holds. 1 max{t: tx K}. 11

12 2.2 Continuous curves in R n Definition (Continuous) curve in R n is a continuous mapping γ : I R n, where I R 1 is an interval. If the interval I is closed, I = [a, b], then γ(a) and γ(b) are end-points of the curve γ. The curve γ is closed if γ(a) = γ(b). The curve γ is simple if the function γ (a,b) is one-to-one. Intervals have a natural orientation which induces orientation on curves. Each curve γ is oriented. We can always change the orientation defining the inverse curve γ. For example, if γ is defined on [, 1], then ( γ)(t) def = γ(1 t). The curves γ 1 : I 1 R n and γ 2 : I 2 R n are called equivalent, if there exists a homeomorphism (i.e., a continuous bijection) ϕ: I 2 I 1 which preserves the orientation of the intervals, and such that γ 2 (s) = γ 1 (ϕ(s)). Normally, we identify equivalent curves. Examples: 1. The segment with end-points at x and y: γ(t) = tx + (1 t)y, t 1. The segment with the inverse orientation is ( γ)(t) = ty + (1 t)x. 2. The circle with the natural (counter clock-wise) orientation γ(t) = (cos t, sin t), t 2π. The circle with the opposite orientation ( γ)(t) = (cos t, sin t). This is also the circle γ(t) = (cos 1t, sin 1t), but run 1 times. 3. Archimedus spiral γ(t) = (t cos t, t sin t), t 2π. Exercise Draw the images (with orientation) of the following curves given in the polar coordinates: r = 1 cos 2t ( t 2π), r 2 = 4 cos t ( t π/2), r = 2 sin 3t ( t π). Draw the image (with orientation) of the curve in R 3 defined as γ(t) = (cos t, sin t, t), < t < Peano curve In 189, Peano discovered a remarkable example of a (continuous) curve filling the whole unit square in R 2. The following construction is taken from the book by Hairer and Wanner. Start with an arbitrary curve γ(t) = (x(t), y(t)), t 1 which lies in the unit square and has the end-points γ() = (, ), γ(1) = (1, ). Now, applying rotation and rescaling, we define a new curve 1 (y(4t), x(4t)), if t 1; (Φγ)(t) = (x(4t 1), 1 + y(4t 1)), if 1 t 2; (1 + x(4t 2), 1 + y(4t 2)), if 2 t 3; (2 y(4t 3), 1 x(4t 3)), if 3 t

13 This curve has the same end-points as γ. Now, we iterate the procedure defining the curves γ 1 = Φγ, γ 2 = Φγ 1, and so on. We need to show that the iterations converge. For a mapping λ: [, 1] R n, we set λ = max t [,1] λ(t). This is again the norm, but this time defined on continuous mappings from [, 1] to R 2. Usually, it is called the uniform norm. Observe, that if we start iterate another curve µ, with γ µ = M, then Φγ Φµ M/2, and hence (2.23) γ k µ k M 2 k. 13

14 Putting here µ = γ m, we get γ k γ k+m M 2 k. Now, applying Cauchy s criterion, we see that the sequence of curves γ k converges uniformly, and therefore has a continuous limit γ. The limiting curve γ is independent of the initial curve γ (look at (2.23), and fills the whole unit square. Indeed, the set γ([, 1]) is a compact (why?) and dense (why?) subset of the unit square, hence, it coincides with the unit square. Problem Show that the coordinates of the limiting curve x (t) and y (t) are continuous nowhere differentiable functions on [, 1]. First examples of such functions were constructed by Weierstrass. Problem Show that there is no one-to-one continuous mapping from the interval [, 1] onto the unit square. 2.3 Arc-wise connected sets in R n Definition The set X R n is called (arc-wise) connected if for each pair of points x, y X there is a curve γ : [a, b] X such that γ(a) = x and γ(b) = y. Examples: R 1 \ {}, R 2 \ {x 1 = }, S n, R 2 \ {}, R 3 \ {x 1 = }. The first two sets are disconnected, the others are arc-wise connected. The next four claims are very useful and have straightforward proofs: Claim 2.27 (continuous images of connected sets are connected). Let A R n be a connected set, and f : A R m be a continuous function. Then the image B = f(a) is also connected. Claim 2.28 (mean-value property). Suppose A is an arc-wise connected set, and f : A R is a continuous function. If inf A f < and sup A f >, then there exists a point x A such that f(x) =. Exercise Prove these claims. An open connected subset of R n is called domain ( = region). Exercise 2.3. Any open set U R n can be decomposed into at most countable union of disjoint domains. 14

15 Claim 2.31 (polygonal connectivity). Each domain Ω in R n is polygonalconnected. That is, for any two points a, b Ω there exists a polygonal line (a curve which consists of finitely many segments) which starts at a and terminates at b. An equivalent statement is that for any two points a, b Ω can be connected within Ω by a finite chain of open balls B, B 1,..., B N : B is centered at a, B N is centered at b, B i B i+1, i N 1, and B i Ω, i N. The proof follows from the Heine-Borel lemma. A function f on an open set X is called locally constant if for any x X there is a neighbourhood U of x such that f is constant on U. Claim An open set X R n function is a constant. is connected iff any locally constant Exercise Check that any curve γ starting at x, x < 1, and terminating at y, y > 1, intersects the unit sphere. Exercise f is a continuous function on the unit sphere S 2 R 3. Prove that there is a point x S 2 such that f(x) = f( x). 15

16 3 Differentiation 3.1 Derivative In this lecture, U is always an open subset of R n, and a U. Definition 3.1. The mapping f : U R m is differentiable at the point a if in a neighbourhood of a it can be well approximated by a linear mapping L L(R n, R m ): (3.2) f(x) = f(a) + L(x a) + o( x a ), x a. It is important that the mapping L does not depend on the direction in which x approach a in (3.2). If such a map L exists, then it is unique. Indeed, set x = a + th, where h = 1 and t >. Then and we can recover L: f(a + th) = f(a) + tlh + o(t), t, f(a + th) f(a) Lh = lim, h = 1. t t The linear map L is called the derivative (or, sometimes, the differential) of f at a. There are several customary notations for the derivative: f (a), df(a), D f (a). Usually, we shall try to stick with the latter one. At this point we need to slightly revise the one-dimensional definition. In Hedva 1, we learn that if f is a real-valued function of one real variable, then its derivative f (a) is a real number. From now on, we should think about it as of a linear mapping from R 1 to R 1 defined as multiplication by f (a). If f is differentiable everywhere in U, the derivative is a map D f : U L(R n, R m ). By C 1 (U) (or sometimes, by C 1 (U, R n )) we denote the class of mappings f such that the map D f is a continuous one. Now, several simple properties of the derivative. 1. If f is a constant map, then D f = everywhere in U. If U is a domain and D f = everywhere in U, then f is a constant mapping. 2. If f L(R n, R m ), then f is differentiable everywhere and D f = f. In the opposite direction, if U is a domain, and f : U R m has a constant derivative in U, then there exist L L(R n, R m ) and b R m such that f(x) = Lx + b. 3. D f+g = D f + D g. 16

17 Exercise 3.3. If the mapping f is differentiable at a, then f is continuous at a. Exercise 3.4. If f C 1 (U), and K U is a compact subset, then f K is a Lipschitz function; i.e. there exists a constant M such that for every x, y K f(x) f(y) M x y. Theorem 3.5 (The Chain Rule). Let f : U R m, f(u) V R m, let g : V R k, and let h = g f. If f is differentiable at a, and g is differentiable at b = f(a), then h is differentiable at a, and D h (a) = D g (b) D f (a). Proof: Set A = D f (a), B = D g (b). We need to check that We have r(x) def = h(x) h(a) B A(x a)?? = o( x a ), x a. u(x) def = f(x) f(a) A(x a) = o( x a ), x a, v(y) def = g(y) g(b) B(y b) = o( y b ), y b. Therefore, r(x) can be written as r(x) = g(f(x)) g(b) B(f(x) b) + B(f(x) f(a) A(x a)). }{{}}{{} v(f(x)) Bu(x) We estimate the terms on the RHS. For an arbitrary small positive ɛ, we choose η > such that v(y) ɛ y b, y b η. Then we choose δ > so small that, for x a δ, With this choice, we have f(x) f(a) η, and u(x) ɛ x a. v(f(x)) ɛ f(x) b = ɛ A(x a) + u(x) ɛ A x a + ɛ 2 x a, and Bu(x) B u(x) ɛ B x a. 17

18 Putting together, this gives us Done! r(x) ( ɛ A + ɛ 2 + ɛ B ) x a = o( x a ). As a special case, we get that if f : U R m is differentiable, T L(R p, R n ), and Q L(R m, R k ), then the mappings f T and Q f are differentiable, and D f T = D f T, D Q f = Q D f. In particular, we see that if f j = (f, e j ), 1 j m are coordinate functions of the mapping f, then f is differentiable iff all the functions f j are differentiable. (Prove!) Exercise 3.6. Let f j : R R be differentiable functions, 1 j n, and f(x) = j f j(x j ) (x j are the coordinates of x). Then f is differentiable and Df(x)h = f i(x i )h i. Exercise 3.7. Show that the function f(x) = x, x R n, is differentiable on R n \ {}, and find its derivative. Differentiate the function x x y 2, x R n. Exercise 3.8. Suppose f : R n R is a non-constant homogeneous C 1 - function; i.e. f(tx) = t k f(x) for all x R n, t >. Prove: (i) k 1, (ii) (Euler s identity). n i=1 x i f x i = kf Exercise 3.9. If f and g are differentiable mappings, and ϕ = f, g, then D ϕ h = D f h, g + f, D g h. Now, we shall turn to the derivatives of scalar functions of several variables. 3.2 The gradient Now, f : U R, and the derivative D f (a) is a linear functional on R n. Therefore, there exists the vector, denoted f(a), such that (3.1) D f (a)h = ( f(a), h), h R n. 18

19 This vector called the gradient of the function f at the point a. For h = 1, the expression (3.1) is called the derivative of f in direction h. We have f(a + th) = f(a) + t( f(a), h) This yields the following geometric properties of the gradient: f increases in the directions h where ( f(a), h) >, and decreases in the directions h where ( f(a), h) <. Direction of the fastest increase is h =, direction of the steepest descent is the opposite one: f(a) f(a) h = f(a). The rate of increase (decay) in f is measured by the f(a) length f(a). If f has a local extremum at a, then f(a) =. The points where the gradient vanish are called the critical points of the function f. f(a) is orthogonal to the level set S = {x: f(x) = f(a)}. Let us comment the last claim. Consider a differentiable curve γ : I S, I = ( c, c), γ() = a. Then f(γ(t)) = f(a), and = d dt f(γ(t)) t= = ( f(a), γ ()). That is, the vectors f(a) and γ () are orthogonal to each other 5. Question Whether the gradient depends on the choice of the inner product in R n? Justify your answer. Exercise 3.12 (Rolle theorem in R n ). Let U R n be a bounded domain, f : U R 1 be a continuous function differentiable on U, and vanishing on the boundary U. Show that there exists x U such that D f (x) =. 3.3 The partial derivatives Choose the orthogonal coordinates in R n, i.e. {e 1,... e n } in R n. Then fix an orthonormal basis D f (a)h = D f (a) i h i e i = i h i D f (a)e i. 5 The vector γ () is called the tangent vector to S at the point a. 19

20 The real numbers D f (a)e i = ( f(a), e i ) are called the partial derivatives of f at a, and are denoted by f x i (a), f xi (a), or by i f(a). That is, f(a) = 1 f(a)... n f(a). Equivalently, the partial derivatives can be defined as f(a + te i ) f(a) i f(a) = lim. t t Probably, you ve started with this definition in the course Hedva 2. Existence of partial derivatives does not imply differentiability of f. Look at the function { x 2 y, (x, y) (, ), x f(x, y) = 2 +y 2 (x, y) = (, ). It s partial derivatives exist everywhere in the plane and vanish at the origin (f(x, ) = f(, y) = = x f(, ) = y (, ) = ). Thus, if f would be differentiable, its derivative at the origin must be the zero linear functional. Then f(x, y) f(, ) ( ) ( x y) ( ) x y Substituting x = y = t, we get ( ) = t 2 t (2t 2 ) 3/2 = 1 2 3/2. x 2 y (x 2 + y 2 ) 3/2. Thus, f is not differentiable at the origin. Existence of partial derivatives does not yield even continuity, as the function { x 2 y (x, y) (, ), x f(x, y) = 4 +y 2 (x, y) = (, ) shows. This function has partial derivatives everywhere in the plane but is discontinuous at the origin. Exercise Fill the details. If the partial derivatives are continuous, then the function must be continuously differentiable: 2

21 Theorem TFAE: (i) f C 1 (U); (ii) Everywhere in U, there exist continuous partial derivatives i f. Proof: (i) = (ii): i f(a) i f(b) = (D f (a) D f (b)) e i D f (a) D f (b), 1 i n. (ii) = (i). First, we prove that f is differentiable at U. To simplify notations, we assume that we deal with the function of two variables. Then f(a 1 + h 1, a 2 + h 2 ) f(a 1, a 2 ) = f(a 1 + h 1, a 2 + h 2 ) f(a 1, a 2 + h 2 ) + f(a 1, a 2 + h 2 ) f(a 1, a 2 ) = f x1 (a 1 + θ 1 h 1, a 2 + h 2 )h 1 + f x2 (a 1, a 2 + θ 2 h 2 )h 2. In the last line, we used the mean value property for differentiable functions of one variable t f(t, a 2 + h 2 ) and t f(a 1, t). Using the continuity of the partial derivatives, we get f(a 1 + h 1, a 2 + h 2 ) f(a 1, a 2 ) = (f x1 (a 1, a 2 ) + o(1)) h 1 + (f x2 (a 1, a 2 ) + o(1)) h 2 = f x1 (a 1, a 2 )h 1 + f x2 (a 1, a 2 )h 2 + o( h h 2 2). That is, the function f is differentiable at the point a = (a 1, a 2 ). It remains to check the continuity of the derivative D f. It is obvious, since D f (a) D f (b) = n ( i f(a) i f(b)) 2. Done! Corollary f C 1 (U, R m ) iff all partial derivatives are continuous: f j x i C 1 (U), 1 i n, 1 j m. i=1 The matrix consisting of partial derivatives 1 f 1... n f f m... n f m is called the Jacobi matrix of the mapping f. In the case m = n, the determinant of this matrix is called the Jacobian of the mapping f. Both will play a very important role in this course. 21

22 Exercise Let f(x, y) be a differentiable function, and g(r, θ) = f(r cos θ, r sin θ). Show that ( ) 2 ( ) 2 ( ) 2 f f g + = + 1 ( ) 2 g. x y r r 2 θ Exercise Let Mat n (R) be the linear space of n n matrices with real entries, and det: Mat n (R) R be the determinant. (i) The function det is continuously differentiable on Mat(R). For each A the derivative at A, i.e. (Ddet)(A), is a linear functional on n n matrices. (ii) if I is the unit matrix, then (Ddet)(I)H = tr(h); (iii) if the matrix A is invertible, then (Ddet)(A)H = (deta) 1 tr(a 1 H); (iv) in the general case, (Ddet)(A)H = tr(a H), where A is the complementary matrix to A, that is, A i,j equals ( 1)i j times the determinant of the (n 1) (n 1) matrix obtained from A by deleting the i-th row and the j-th column. 3.4 The mean-value theorem Theorem Let f : U R be a differentiable function, and let [a, b] U be a closed segment. Then there exists ξ (a, b) such that f(b) f(a) = ( f(ξ), b a) = D f (ξ)(b a). Proof: Define the function ϕ(t) def = f(a + t(b a)), t 1, and apply the one-dimensional mean-value theorem. Corollary In the assumptions of the previous theorem, f(b) f(a) sup D f (ξ) b a. ξ (a,b) Exercise Suppose U R n is an open convex set, f : U R 1 is a differentiable function, such that D f M everywhere in U. Then, for any a, b U, f(b) f(a) M b a. 2. Construct a non-convex domain U R 2 and a function f C 1 (U) such that D f 1 everywhere in U, but f(b) f(a) > b a for some a, b U. Exercise Let U R 2 be a convex domain. Let f be a C 1 -function in U. If 1 f everywhere in U, then f does not depend on x Whether the result from item 1. persists if U is an arbitrary domain in R n? (Prove or disprove by a counterexample). 22

23 3.5 Derivatives of high orders In this course, we shall use the high order derivatives only occasionally. So here, we restrict ourselves by few comments. Partial derivatives of higher orders are defined recursively; e.g. ij 2 = ( ) x i x j etc. A very important fact, which you certainly know from Hedva 2 says that under certain assumptions it does not matter in which order to take the partial derivatives. Theorem If the mixed derivatives 2 ijf, 2 jif of a scalar function f exist and continuous at a, then 2 ijf(a) = 2 jif(a). If you are not sure that you remember this result, I strongly suggest to look at any analysis textbook. The next six exercises also pertain to Hedva-2, rather than to Hedva-3. Exercise Suppose = n j=1 2 j (this is the second order differential operator in R n called Laplacian). (i) Check that (f g) = f g + 2( f, g) + g f. (ii) Compute Laplacian of the functions f(x, y) = log(x 2 +y 2 ), (x, y) R 2 \, and f(x) = x n+2, x R n \ {}. (iii) Let R θ be the (counterclockwise) rotation of the plane by angle θ, and F θ : f f R θ be the composition operator. Check that the operators and F θ commute, i.e., F θ = F θ. Exercise The function { 1 /4t f(x, t) = t e x2 t > t, (x, t) (, ) is infinitely differentiable in R 2 \{(, )} and satisfies therein the heat equation f xx = f t. Exercise 3.25 (combinatorics). How many m-th order partial derivatives has an infinitely differentiable function of 3 variables? of n variables? Exercise Let f be an infinitely differentiable function on R 3, and ϕ(t) = f(t, t 2, t 3 ). Find the first 4 terms of the Taylor expansion of ϕ at t =. 23

24 Exercise Show that the point (, ) is a local extremum of the function 1 (1 x)2 + (1 y) (1 + x)2 + (1 y) 2 1 (1 x)2 + (1 + y) (1 + x)2 + (1 + y). 2 Find out whether this is a local minimum or local maximum. Exercise Suppose f is a C 2 -function such that f ( is the Laplacian). Prove that f does not have strong local maxima Hint: start with a stronger assumption f >. The second derivative Df 2 of a scalar function f is the derivative of the map D f : U L(R n, R 1 ), that is Df 2 is an element of the space L(Rn, L(R n, R 1 )). If Df 2 exists and continuous in U, then we say that f is twice continuously differentiable in U and write f C 2 (U). Elements of the linear space L(R n, L(R n, R 1 )) can be identified with (R 1 -valued) bilinear forms; i.e. with functions ϕ: R n R n R 1 linear with respect to each argument: if Φ L(R n, L(R n, R 1 )), then ϕ(h 1, h 2 ) = (Φh 1 ) h 2. Therefore, the second derivative Df 2 can be viewed as a symmetric (R1 -valued) bilinear form. Similarly, if f : U R m, then Df 2 can be regarded as a symmetric Rm - valued bilinear form. The higher derivatives are defined by recursion with respect to the order k. If the function f has continuous derivatives of all order k in U, then we say that is C k -smooth. Problem Let f : U R m be a C k -smooth mapping. Identify the k- th derivative Df k with a symmetric k-linear Rm -valued form; i.e. a mapping R } n R n {{... R n } R m which is linear with respect to each argument k times (when the others are fixed) and symmetric. Hint: use induction with respect to k. 24

25 4 Inverse Function Theorem In this lecture, we address the following problem. Let f be a C 1 -mapping. Consider equation f(x) = y. When it can be (locally) inverted? what are the properties of the inverse mapping f 1? We expect that locally f behaves similarly to its derivative D f ; i.e. if D f (a) is invertible, then f maps in a one-to-one way a neighbourhood of a onto a neighbourhood of b = f(a), that the inverse map is differentiable at b, and D f 1(b) D f (a) = I (the identity map). The main result of this lecture (which is the first serious theorem in this course!) confirms that prediction. However, first, we will prove the following simple Claim 4.1. Let U R n be an open set, f : U R m be a differentiable function with a differentiable inverse. Then m = n. Proof: We have x = f 1 f(x), and the chain rule is applicable: D f 1(f(x)) D f (x) = I n (identity map). Recall that D f L(R n, R m ). Then a result from the Linear Algebra course insists that n m. (Recall the result!). By symmetry, m n. Thus m = n, completing the proof. 4.1 The Theorem Our standing assumptions are: U R n is a domain, a U, f : U R n is a C 1 -mapping. Theorem 4.2. Suppose that the linear map D f (a) is invertible. Then 1. there are a neighbourhood U = U a of a and a neighbourhood V = V b of b = f(a) such that f U is a one-to-one mapping, and f(u) = V ; 2. the inverse map g : V U is a C 1 -mapping, and D g (b) D f (a) = I n. Examples: 1. (n = 1) The function of one variable { t + 2t 2 sin(1/t) t, f(t) = t = is differentiable, and the derivative f (t) is bounded everywhere. However, this function is not one-to-one on any neighbourhood of the origin. Here, the C 1 -assumption is violated. 25

26 2. (n = 2) The mapping f defined as { x 1 = e x cos y, y 1 = e x sin y meets conditions of the Inverse Function Theorem. f maps R 2 onto R 2 \ {}, but for any w R 2 \{}, the equation f(z) = w has infinitely many solutions. This shows that the IFT can be applied only locally. 4.2 Continuity of the inversion of linear operators We shall prove the linear algebra claim that we need in the proof of the IFT. By GL n we denote the group of all invertible linear mappings from L(R n, R n ). Claim Suppose A GL n, and B L(R n, R n ) is such that B A < 1 A 1, then B GL n as well. 2. The mapping A A 1 is continuous in the operator norm. The second assertion says that if a sequence of operators B k converges to A GL n in the operator norm, then according to the item 1, B k GL n for sufficiently large k, and B 1 k also converges to A 1. Proof of the claim: 1. Set Then α = 1 A 1, β = B A. Bx Ax (B A)x Ax β x. Since, x = A 1 Ax α 1 Ax, we get Ax α x, and then Bx (α β) x > for all x R n \ {}. Thus, B is invertible. 2. Start with the identity B 1 A 1 = B 1 (A B)A 1, then B 1 A 1 B 1 A B A 1. 26

27 We already know from the first part that Bx (α β) x, or that is, and finally (α β) B 1 y BB 1 y = y, B 1 1 α β, B 1 A 1 β α(α β). Therefore, B 1 A 1 can be made arbitrary small, if β = B A is small enough (and α is fixed). 4.3 Proof of the IFT The proof will be split into several parts. Set A = D f (a), λ = 1 4 A 1, and choose a sufficiently small ball B centered at a such that sup D f (x) A λ. x B (Why this is possible?) This choice guarantees that D f (x) is invertible everywhere in B The map f is one-to-one in B. Let x, x+h B. We shall estimate from below the distance f(x+h) f(x). We need the following Claim 4.4. f(x + h) f(x) Ah 1 2 Ah. Proof: introduce the R n -valued function of one variable Then ϕ (t) = D f (x + th)h Ah, and ϕ(t) def = f(x + th) tah. ϕ (t) D f (x + th) A h λ h = A 1 h 1 4 Ah.

28 Therefore, ϕ(1) ϕ() = proving the claim. Now, 1 1 ϕ (t) dt ϕ (t) dt 1 4 Ah, f(x + h) f(x) Ah f(x + h) f(x) Ah Ah that is, f is one-to-one on the ball B. We shall record the estimate 1 h = 2λ h, 2 A 1 (4.5) f(x + h) f(x) 2λ h for future references Surjectivity We shall show that f(b) is an open set. This will be the neighbourhood V of b, where the inverse map f 1 exists. Take y f(b), y = f(x ) (x B), and take a ball B = B(x, r) such that B B. Claim 4.6. B(y, λr) f(b ). Of course, this claim yields that the image f(b) is an open set. Proof of the claim: Take any y such that y y < λr. We are looking for the point x B such that y = f(x ). It is natural to try to find x by minimizing the function ϕ(x) = y f(x) 2 over the closed ball B. First of all, show that the minimum is not attained on the boundary sphere. Observe that ϕ(x ) = y f(x ) 2 < (λr) 2 (due to the choice of y). On the other hand, if x S = B, then ϕ(x) = f(x) y > f(x) y + f(x ) y λr f(x) f(x ) λr (4.5) 2λ x x λr = 2λr λr = λr, whence ϕ(x) > (λr) 2. Thus, ϕ does not achieve its minimum on the boundary of B. 28

29 Let x B be the minimum point of ϕ, then D ϕ (x ) =. Differentiating ϕ(x) = (y f(x), y f(x)), we get and we arrive at the equation D ϕ (x)h = 2 D f (x)h, y f(x), D f (x )h, y f(x ) =, for any h R n. Since the linear map D f (x) is invertible everywhere on B, and in particular at x, the range of D f (x ) coincides with R n ; i.e., {D f (x )h} h R n = R n. Since y f(x ) belongs to the orthogonal complement to this set, we get y f(x ) =, that is y = f(x ), proving Claim Continuous differentiability of the inverse map Let g = f 1. It remains to show, that g C 1 (V ), where V = f(b), and that D g = D 1 f. As in the one-dimensional case, first, we shall see that g is a continuous map, then we check that g is differentiable and D g = D 1 f, and then that the derivative D g is continuous on V. Let y, y + k V = f(b). We need to estimate the norm of h = g(y + k) g(y). Set x = g(y). Then and by estimate (4.5) f(x + h) f(x) = f(g(y + k)) y = y + k y = k, k = f(x + h) f(x) 2λ h = 2λ g(y + k) g(y). This gives us continuity of g. Now, let L = D 1 f. Since we get Lk = h + Lr x (h), or k = f(x + h) f(x) = D f (x)h + r x (h), g(y + k) g(y) Lk = h Lk = Lr x (h). We shall estimate the norm of this expression. Since f is differentiable, r x (h) ɛ h, provided that k (and therefore h ) is sufficiently small. Hence Lr x (h) L r x (h) ɛ L h (4.5) ɛ L k 2λ. 29

30 This shows that Lr x (h) lim k k =. Hence, g is differentiable at y, and its derivative is D g (y) = L = D f (x) 1. It remains to check the continuity of the map y D g (y). Let us look again at the formula we ve obtained: D g (y) = (D f (g(y))) 1. The RHS is a composition of three mappings: y g(y), a D f (a), and L L 1. The first two are continuous. The third map L L 1 is also continuous (item 2. of Claim 4.3). This does the job and finishes off the long proof of the Inverse Function Theorem. Exercise 4.7. The mapping f maps the coefficients a 1, a 2, a 3 of the equation x 3 + a 1 x 2 + a 2 x + a 3 = into its roots x 1 x 2 x 3. Prove that f is a C 1 - mapping defined in a neighbourhood of the point a 1 = 3, a 2 = 2 and a 3 =, and compute Jacobian of f at this point. Exercise 4.8. The mapping f : R n R n maps x 1, x 2,... x n into the coefficients a 1, a 2,... a n of the polynomial (x x 1 )(x x 2 )... (x x n ) = x n + a 1 x n a n. (i) Prove that the rank of D f equals the number of distinct values among x 1,..., x n. (ii) Compute the Jacobian of f. Hint: start with the cases n = 2 and n = 3. Exercise 4.9. Let f be a C 1 -mapping. Show that the rank of D f is an upper semi-continuous function, i.e., rankd f (x) rankd f (x ) in a neighbourhood of the point x. In the next two lectures, we shall show how powerful the Inverse Function Theorem is. We exhibit several of its important consequences: the open mapping theorem; the implicit function theorem; the Lagrange multipliers method. 3

31 5 Open Mapping Theorem and Lagrange Multipliers 5.1 Open Mapping Theorem Definition 5.1. The mapping f : U R m is called open if for any open subset U U the image f(u ) is also open. Definition 5.2. The C 1 -mapping f is regular, if, for all a U, rank D f (a) = m. Recall that, for L L(R n, R m ), rankl = dim(lr n ) (in other words, the maximal number of linear independent columns in the matrix representation of L). Theorem 5.3. Regular mappings are open. We shall prove a local version of this result which says: (5.4) rank D f (a) = m = f(a) intf(u). Of course, the Open Mapping Theorem follows from (5.4). If m = n, then (5.4) is a part of the Inverse Function Theorem. Now, we shall reduce the general case to this special one. First take a linear map T L(R m, R n ) such that D f (a) T is invertible. Such a map T exists. Indeed, since rank D f (a) = m, we have dim Ker D f (a) = n m. Let Y be the orthogonal complement to Ker D f (a) in R n, dim Y = m. As we know from the Linear Algebra course, the mapping D f (a) is one-toone on the subspace Y. Choose any one-to-one linear mapping T : R m Y, then D f (a) T is invertible. Now, consider the function g(x) def = f(a+t x). Its derivative at the origin D g () = D f (a) T is invertible. Thus by the Inverse Function Theorem, there is a neighbourhood V of the point g() = f(a) which belongs to f(u). That is, f(a) lies in the interiour of the image f(u). 5.2 Lagrange Multipliers In the courses Hedva 1 and Hedva 2 you ve learnt how to find extrema of functions of one and several variables. Here, we learn how to find the extremal values of functions when the variables are subjects to additional restrictions. Let U R n be an open set, the functions f, g 1,..., g k are C 1 (scalar) functions defined on U, M = {x U : g 1 (x) =... = g k (x) = }. 31

32 We want to find extremal values of f when x U is subject to additional restrictions given by x M. Such extremal values are called conditional. Theorem 5.5. Let a M be a conditional extremum of f. Suppose that the vectors g 1 (a),..., g k (a) are linearly independent vectors. Then the linear span of these vectors contains the vector f(a). In other words, there exist constants λ 1,..., λ k such that f(a) = k λ k g k (a). j=1 Proof: Define the map H : U R k+1 by f H = g g k This is a C 1 -map, and rank(d H (a)) k (since the vectors g 1 (a),..., g k (a) are linearly independent). Assume that a is a conditional minimum of f: f(x) f(a) x M U a, where U a is a neighbourhood of a. If rank(d H (a)) = k +1, then by the Open Mapping Theorem f(a)... = H(a) int H(U a). But then there exist t < f(a) and x U a such that t H(x) =.... In other words, there is x M such that f(x) = t < a. Contradiction! Hence rank D H (a) = k, and the vector f(a) belongs to the linear span of the vectors g 1 (a),..., g k (a). Now, we bring several applications of the Lagrange multipliers technique. 32

33 5.2.1 Geometrical extremal problems We start with simple geometrical problems taken from the high-school analytic geometry : Find the distance from the origin to the affine hyperplane α i x i = c in R n. Solution: Here, we minimize the function f(x) = x 2 i under condition g(x, y, z) = α i x i c =. In this case, the Lagrange equations are { 2x i = λα i, 1 i n, αi x i = c. We get or c = λ 2 α 2 i, λ = 2c α 2 i. Substituting this value into the first n of the Lagrange equations, we get the coordinates of the point where f attains the conditional maximum: x i = cα i, 1 i n. α 2 i The distance is x 2 i = Exercise 5.6. Find the point on the line { αx + βy + γz the closest to the origin. c α 2 i. = c x + y + z = 1 Exercise 5.7. The closed curve Γ R 3 is defined as the intersection of the ellipsoid 3 x 2 j = 1, and the plane j=1 a 2 j 3 A j x j =. j=1 Find the points on Γ which are the closest and the most distant from the origin. 33

34 Isoperimetry for Euclidean triangles By A we denote the area, and by L the length. The Dido isoperimetric inequality says that, for any plane figure G, A(G) L( G)2, 4π and equality is attained for discs only. For triangles, the estimate can be improved: Theorem 5.8 (Heron). For any plane triangle, (5.9) A( ) L( )2 12 3, and the equality sign attains for the equilateral triangles and only for them. In other words, among all triangles with the given perimeter, the equilateral one has the largest area. Solution: is based on the Heron formula that relates the area A and length of the sides x, y and z: A 2 = L ( ) ( ) ( ) L L L 2 2 x 2 y 2 z. Set L = 2s. Then we need to maximize the function under condition f(x, y, z) = s(s x)(s y)(s z) g(x, y, z) = x + y + z 2s =. Of course, we have additional restrictions x, y, z >, x + y > z, x + z > y, y + z > x, which define the domain U in the space (x, y, z). (Draw this domain!) On the boundary of this domain (when the inequalities turn to the equations), the function f identically vanishes. Thus, f attains its maximal value inside U and we can use the Lagrange multipliers. The Lagrange equations are s(s y)(s z) = λ s(s x)(s z) = λ s(s x)(s y) = λ x + y + z = 2s 34

35 The first three equations give us (s y)(s z) = (s x)(s z) = (s x)(s y), whence x = y = z = 2 3 s, and A 2 = s (s/3) 3. The result follows The linear algebra problems Here, we look at two Linear Algebra problems. Extrema of quadratic forms We are looking for the maximal and minimal values of the symmetric quadratic form n f(x) = a ij x i x j a ij = a ji, on the unit sphere i,j=1 n x 2 i = 1. i=1 In this case, f(x) = (Ax, x), where A is a symmetric linear operator with the matrix coefficients a ij. Thus f(x) = 2Ax. Furthermore, g(x) = i x2 i 1, and g(x) = 2x. Therefore, the Lagrange equations take the form { 2Ax = 2λx (x, x) = 1. Hence, λ is the eigenvalue of A, and the maximum of the form is the largest eigenvalue, the minimum of the form is the smallest eigenvalue. The operator norm As a corollary, we compute the (operator) norm of a linear operator L L(R n, R n ). By definition, L = max Lx. x =1 Thus, we need to maximize the function f(x) = Lx 2 = (Lx, Lx) under additional condition x 2 = 1. Observe that f(x) = (Lx, Lx) = (L Lx, x). Hence, by the previous paragraph, L 2 equals the maximal eigenvalue of the symmetric matrix L L. Exercise 5.1. Let L be an invertible linear operator. Find L 1. 35

36 5.2.3 Inequalities Lagrange multipliers are very useful in proving inequalities. Here are several examples. The Hölder Inequality Let 1 < p <. Then (5.11) { } 1/p { } 1/q x i y i xi p yi q, where q is the dual exponent to p: 1 p + 1 q = 1. Proof: We assume that all x i s and y i s are non-negative. Since Hölder s inequality is homogeneous with respect to multiplication of all x i by the same positive number, we assume that x p i = 1. Given y R n with non-negative coordinates, define the function f(x) = x i y i. That is, for a compact set K = {x R n : x i, x p i = 1}, we want to prove that max K f { y q i }1/q. We use induction with respect to the number n of variables. For n = 1, we have K = {1}, and there is nothing to prove. For an arbitrary n 2, we look at the extremum of f on K. We assume that all y i are positive (otherwise, the actual number of variables is reduced and we can use the induction assumtpion). Note that he Lagrange multipliers technique can be applied only on the set K = {x R n : x i >, x p i = 1}. (why?) However, the rest K \ K consists of x s such that at least one of the coordinates x i vanishes and x p i = 1. Hence, by the assumption of the induction max K\K f < { y q i }1/q. Now, using the Lagrange method, we shall find that the conditional extremum of f under assumptions g(x) = x p i 1 = equals { y q i }1/q. Hence, this is the conditional maximum. This will prove Hölder s inequality (and also will show that in cannot be improved). The Lagrange equations have the form y i = λpx p 1 i, 1 i n, x p i = 1. To simplify notations, set ν = λp. Then x i = (y i /ν) 1 p 1, 1 i n, whence 1 = (y i /ν) p p p p 1, or ν p 1 = y p 1. We get then x j = (y j /ν) 1 1 p 1 = y i p 1 j f(x) = x j y j = y 1+ 1 p 1 j { y p { y p p 1 i 36 p 1 i } 1/p, 1 j n, } 1/p { } 1 1 = y q p j = { } 1 y q q j

37 (recall that p p 1 = q). We proved Hölder s inequality in the case of finitely many variables {x i } and {y i }. Note that it persists in the case of countable many variables x i and y i. In this case, it means that if two series x i p and y i q converge (and q is dual to p), then the series x i y i also converges and inequality (5.11) holds. Exercise Prove that, for x i >, n 1 x x n n x 1 x 2... x n x 1 + x x n n. The equality sign attains only in the case when all x i s are equal. Hint: to get the first inequality, minimize x 1 x 2... x n under assumption that all x i s are positive and i x 1 i = 1. To get the second inequality, maximize x 1 x 2... x n under assumption that all x i s are positive and i x i = 1. Exercise Find the maximum of the function f(x, y, z) = x a y b z c (a, b, c > ), where x, y and z are positive, and x k + y k + z k = 1 (k > ). The following inequality is essentially more involved. It s first proof used Lagrange multipliers, though later Pólya and Carleson found direct proofs: Problem 5.14 (Carleman). Suppose c i >. Then {c 1... c n } 1/n < e c k. n k For those who like inequalities, there is an excellent classical book: Inequalities by Hardy, Littlewood and Pólya. Our Library also has a recent book by Steele The Cauchy-Schwarz master class which looks good. 37

38 6 Implicit Function Theorem 6.1 Curves in the plane Start with a motivation. Assume we have a curve in the plane R 2 defined implicitly by (6.1) f(x, y) =, where f is a smooth function. We want to solve this equation; i.e. to find a smooth function x = g(y) such that (6.2) f(g(y), y), or a smooth function y = h(x) such that f(x, h(x)). After a minute reflection, we come to the conclusion that there is at least one obstacle for this: differentiating identity (6.2), we get or f x g + f y =, g (y) = f y(g(y), y) f x(g(y), y). That is the function g(y) does not exists at the points y such that simultaneously f(x, y) = and f x (x, y) =. Similarly, the function h(x) does not exists at the points x such that simultaneously f(x, y) = and f y (x, y) =. To fix the idea, we will be after the function x = g(y). The simplest example is the unit circle: f(x, y) = x 2 + y 2 1. We see that the function g(y) does not exists at the points (, ±1). Now, assume that we ve excluded these points. Then, there are two solutions x = g ± (y) = ± 1 y 2, and we need to specify which sign to take. We can do this simply by choosing one point (a, b) (a ) which automatically determines the whole branch: After we fixed the point (a, b) we can uniquely determine the smooth function x = g(y) we were looking for. It is defined in a neighbourhood of b and g(b) = a. Another example is the the famous folium cartesii. This is a curve in R 2 defined by the equation a is a positive parameter. f(x, y) = x 3 + y 3 3axy =, 38

39 39

40 The point (, ) is a singular point, since at this point f x = f y = and we cannot resolve the equation f(x, y) =. There are two other points on the folium: (a2 1/3, a2 2/3 ) and (a2 2/3, a2 1/3 ); at the first one f x vanishes, at the second one f y vanishes. So now, we can formulate the two-dimension result that we hope to get 6 : Theorem 6.3 (2D version). Let U R 2 be a domain, and f C 1 (U, R 2 ). Let (a, b) U be such a point that f(a, b) =, and f x(a, b). Then there exists a neighbourhood W of b and a unique function g C 1 (W ) such that g(b) = a, and f(g(y), y) in W. Exercise 6.4. The following equations have unique solutions for y = y(x) near the points indicated: x cos xy =, (1, π/2), and xy + log xy = 1, (1, 1). Prove that in both cases the function y(x) is convex. Exercise 6.5. Find the maximum and the minimum of the function y that satisfies the equation x 2 + xy + y 2 = The theorem Here are our standing assumptions: U R n R m is an open set, we denote points of U by (x, y), x R n, y R m ; f : U R n is a C 1 -mapping; (a, b) U is such a point that f(a, b) =. We consider equation (6.6) f(x, y) =. This is a system of n equations with n + m variables. We want to solve this system; i.e. to express x in terms of y. More formally, we are looking for a neighbourhood W of b and for a C 1 mapping g : W R n such that a = g(b) and f(g(y), y), y W. By f x we denote the derivative of the mapping f with respect to the variable x R n when the other variable y is fixed, f x L(R n, R n ), and by f y the derivative of f with respect to y when x is fixed, f y L(R m, R n ). 6 It is a special case of a general result proven below by applying the inverse function theorem, though this special case can be proven by elementary means you know from Hedva-1 4

41 Before stating and proving the theorem, let us try to guess the result linearizing the problem. We have ( ) x a f(x, y) = f(a, b) + D f (a, b) + remainder y b = f(a, b) + f x(a, b)(x a) + f y(a, b)(y b) + remainder. Discarding the remainder, to get the function x = g(y), we have If x f(a, b) is invertible, then we get x f(a, b)(x a) + y f(a, b)(y b) =. x = a [ x f(a, b)] 1 y f(a, b)(y b). The left hand side is the unique function g we are looking for. Now, the theorem follows: Theorem 6.7. Suppose that the linear operator f x(a, b) is invertible. Then there is a neighbourhood W of b and a unique C 1 -mapping g : W R n such that a = g(b), and (6.8) f(g(y), y), for any y W. Remark The derivative of the mapping g is given by (6.9) g (y) = [f x(g(y), y))] 1 f y(g(y), y) We get this differentiating equations (6.8) by y. Exercise 6.1. If, in assumptions of the implicit function theorem, the function f belongs to C k (U) (k 2), then the function g also belongs to C k (W ). Hint: use induction with respect to k. 6.3 Proof of the implicit function theorem The idea is not difficult. To explain it, consider the simplest case m = n = 1. For each y in a neighbourhood of the point b, we are looking for the unique C 1 -solution x(y) of the equation f(x, y) = such that x(b) = a. Instead, we consider a more general system of two equations { f(x, y) = ξ, y = η, 41

42 with respect to the variables (x, y). The right-hand side belongs to a neighbourhood of the point (, b), the solution should belong to a neighbourhood of the point (a, b). The Inverse Function Theorem says that there exists a unique C 1 -solution x = ϕ(ξ, η), y = ψ(ξ, η) to this system. Clearly, y(ξ, η) = η. Now, recalling that we are interested only in a special case ξ =, we get the C 1 -function x(y): = ϕ(, y) such that f(x(y), y) =, and x(b) =. To make this argument formal, define the mapping ( ) f(x, y) F (x, y) = : U R n+m. y Its derivative at the point (a, b) acts as follows ( ) ( ) h h D F (a, b): D f k. k k In the block-matrix form, D F = ( ) x f y f. I Since x f(a, b) is invertible, the linear operator D F (a, b) is also invertible More formally, if ( ) h D f = k k =, ( ) h then D f =, and by our assumption about the operator x f(a, b), h =. By the Inverse Function Theorem, there are the neighbourhoods (a, b) Ũ Rn+m, (, b) Ṽ Rn+m, and the C 1 -mapping Φ: identity map. We have Ṽ Ũ such that Φ(, b) = (a, b), and F Φ is the Φ(ξ, η) = (ϕ(ξ, η), η), (η, η) Ṽ, where ϕ: Ṽ Rn is a C 1 -function. In other words, f(ϕ(ξ, η), η) ξ, for any (ξ, η) Ṽ. 42

43 Now, define the neighbourhood of b: W = {η : (, η) C 1 -mapping g(y) = ϕ(, y), y W. Then we have Ṽ }, and the f(g(y), y) for y W, and g(b) = ϕ(, b) = a. The mapping g with these properties is unique. Indeed, if f(x, y) = f(x, y) for (x, y), (x, y) Ṽ, then F (x, y) = F (x, y), and since F is oneto-one x = x. This completes the proof of the Implicit Function Theorem. If you feel that this proof is too complicated, I strongly suggest, first, to work out all its details in the case n = m = 1, and only when this special case will be clear to pass to the general case. Exercise Let f : U R 1 be a C 1 -function on an open set U, such that f x j in U for all j = 1, 2,..., n. Then the equation f(x 1,... x n ) = locally defines n functions x 1 (x 2,..., x n ), x 2 (x 1, x 3,..., x n ),..., x n (x 1,... x n 1 ). Find the product x 1 x 2... x n 1 x n. x 2 x 3 x n x 1 43

44 7 Null-sets It is not a simple task to understand what is the volume? and which sets have the volume? In our course, we ll be able to answer this question only partially. It is much simpler to understand which sets have zero volume. These sets play a very important role in analysis and its applications. 7.1 Definition Let Q = I 1 I 2... I n R n be a brick (I k are the intervals, open, semi-open, or closed). Its volume equals v(q) = n I k. Definition 7.1. The set E R n is a null-set if for each ɛ > there exist open bricks Q j such that E j Q j, and j v(q j) < ɛ. The null-sets are also called the zero measure sets, and the negligible sets. By N we denote the set of all null-sets. Here are some useful properties of the null-sets: k=1 1. If E N and F E, then F N. 2. The countable sets are null-sets. Indeed, if E = {a m } m N, then for each m, we find a cube Q m that contains a m and such that v(q m ) < ɛ2 m. 3. The countable union of null-sets is a null-set. The proof is similar: if E = E m, then we cover E m j Q jm in such a way that j v(q jm) < ɛ2 m. 4. If E is a compact set in R n, then for each ɛ > there is a finite covering E j Q j such that j v(q j) < ɛ. Note, that the number of the bricks Q j in the covering of a null-set E can be infinite. For instance, if E = Q [, 1] is a set of all rational points in [, 1], then there is no finite covering with small total volume. (Why?) Here are several exercises which help to get used to the new notion: Exercise 7.2. Prove or disprove: if E N, then the closure of E is also a null-set, Ē N. 44

45 Exercise 7.3. Denote by Q the projection of the set Q R n onto the hyperplane R n 1 ; i.e. Q = {x R n 1 : y R 1 (x, y) Q}. Show that if Q is a null-set, then Q is also a null-set. Is the converse true? Exercise 7.4. Let A R n, and f : A R n be a Lipschitz map; i.e. f(x) f(y) M x y for any x, y A. Show that if E A is a null-set, then fe is a null-set as well. Exercise 7.5. Suppose U R n is an open set, f C 1 (U, R n ). If E U is a null set, then fe is a null set as well. Hint: any open subset of R n can be exhausted from inside by finite unions of closed bricks. The both exercises yield that the image of a null-set under a linear transformation L L(R n, R n ) is again a null-set. We see that the notion of null-set is independent on the choice of the coordinates in R n. 7.2 Examples of null-sets Theorem 7.6. Let Q R n be a closed brick, and f be a continuous function on Q. Then the graph Γ f = {(x, f(x)): x Q} of f is a null set in R n+1. Proof: Fix ɛ > and choose δ > such that x 1 x 2 < δ yields f(x 1 ) f(x 2 ) < ɛ. Let Π be a partition of Q onto closed bricks, the partition is chosen so fine that the diameter of each brick from Π is < δ. For any brick S from Π, the oscillation of f on S is less than ɛ: M f (S) m f (S) < ɛ, where M f (S) = max f, m f (S) = min f. S S The graph Γ f is covered by the n+1-dimensional bricks S = S [m f (S), M f (S)], and v n+1 ( S) = v n (S) (M f (S) m f (S)) < ɛv n (S). Then and we are done. v n+1 ( S) < ɛ S S v n (S) = ɛv n (Q), In the proof, we silently used that for any finite partition Π of the brick Q onto bricks S: v(s) = v(q). Exercise 7.7. Prove this! S Π 45

46 The theorem we proved persists for the graphs of continuous functions defined on open subsets of R n. Exercise 7.8. Prove this! Hint: again the same idea: any open subset of R n can be exhausted from inside by finite unions of closed bricks. Exercise 7.9. Let f : I R 1 be an increasing function, I be an interval. Show that the discontinuity set of f is a null set. Hint: show that for each j N, the set of points x I such that f(x + ) f(x ) 1/j is a null-set. Another important example is provided by Theorem 7.1 (Sard). Let U be an open set in R n and f is a non-constant C 1 -function on U. Let B f = {x U : D f (x) = } be the set of critical points of f, and C f = f(b f ) be the set of critical values of f. Then C f is a null-set. Note that the set B f of critical points does not have to be a null-set. Proof of Sard s theorem in the one-dimensional case: First, we assume that f C 1 (J, R 1 ), where J R 1 is a closed bounded interval. Fix an arbitrary ɛ >. Since f is uniformly continuous on J, we can split J onto small closed sub-intervals I with disjoint unions such that f (x) f (y) ɛ for x, y I. Consider only those sub-intervals I that contain at least one critical point of f. If f (ξ) = for ξ I, then, by the choice of I, max f ɛ, whence I β β max f min f = f(β) f(α) = f f ɛ I. I I α α In other words, the image f(i) is contained in an interval of length ɛ I. Thereby, the critical set C f is covered by a system of intervals of length ɛ I ɛ J. Now, if f is a C 1 -function on an open set U R 1, then we exhaust U by finite unions of closed bounded intervals and apply the special case proved above. Problem Prove Sard s theorem in the multi-dimensional case. An excellent reference to this problem is Milnor s book Topology from the differentiable viewpoint. 46

47 7.3 Cantor-type sets Classical Cantor set We construct a nested sequence of compacts K K 1... K m.... First step: K = [, 1]. Second step: we remove from K the middle third, i.e. K 1 = K \ ( 1 3, 2 3 ). Then K 1 consists of two closed intervals of equal length, and K 1 = 2 3. After the m-th step, we obtain a compact K m which is a union of 2 m equal intervals of total length K m = ( 2 3 )m ; K K 1... K m. On the m + 1-st step, we remove from each of these intervals the middle third, and obtain the compact K m+1. And so on. Then we set K = m K m. This is a compact null-set called the classical Cantor set. Exercise Show that 1. the set Ω = [, 1] \ K is an open set; 2. Ω = K; 3. K has no isolated points. 4. K coincides with the set of points x [, 1] which have no digit 1 in their ternary representation. Exercise Consider the set of points x [, 1] which have no digits 7 in the decimal expansion. Whether this is a null-set? 47

48 7.3.2 Cantor set with variable steps The Cantor construction has various modifications. For example, on each step, we can remove not the middle third of each component but an interval of a different length that varies from step to step. On the m-th step we remove from each closed interval I the middle open interval of the length ɛ m I (in the previous construction, ɛ m = 1 for all m). 3 Then the length of the compact K m we obtain on the m-th step is Now we need to exercise: K m = (1 ɛ m ) K m 1 =... = Exercise Let < ɛ j < 1. Then lim m j=1 m (1 ɛ j ) = m (1 ɛ j ). j=1 if and only if ɛ j = +. j We see that the limiting Cantor-type set K = m K m is a null-set if and only if j ɛ j = +. If ɛ j decay so fast that the series ɛ j converges, then Cantor-type set K is not a null-set. So that, we obtain an open set Ω = [, 1] \ K with a huge boundary Ω = K which is not a null set Cantor-type sets in R 2 If we perform the same construction in the unit square (on each step, one square is replaced by 4 equal smaller sub-squares on the corners of the original square), then the open complement [, 1] 2 \ K is connected, and we obtain a domain in the unit square with a large boundary which is not a null-set. This time, we get a nested sequence of compacts K m in the unit square, each K m consists of 4 m equal closed squares, and the area of K m is Then we set K = m K m. Exercise Check that m (1 ɛ j ) 2. j=1 48

49 1. K is a compact set with empty interiour and without isolated points; 2. K is a null-set if and only if ɛ j = + ; 3. Ω = (, 1) 2 \ K is a domain, Ω K. Thus, we ve constructed a domain in R 2 whose boundary is not a null-set. 49

50 8 Multiple Integrals 8.1 Integration over the bricks Darboux sums Let Q = I 1 I 2... I n be an n-dimensional brick, f : Q R 1 a bounded function on Q. Construction of the Darboux sums is practically the same as in the onedimensional case. Let Π be a partition of the brick Q, S a brick from the partition Π. We set M f (S) = sup f, S m f (S) = inf S f. Then the upper and lower Darboux sums (of the function f with respect to the partition Π) are defined as follows: U(f, Π) = S M f (S)v(S), L(f, Π) = S m f (S)v(S). The proofs of the following properties are the same as in the one-dimensional case: 1. if the partition Π is finer 7, than the partition Π, then L(f, Π) L(f, Π ), U(f, Π ) U(f, Π); 2. for each partitions Π 1 and Π 2, Exercise 8.1. Prove these properties! The fundamental definition L(f, Π 1 ) U(f, Π 2 ). Definition 8.2. The function f : Q R 1 is (Riemann) integrable over Q if f is bounded, and sup L(f, Π) = inf U(f, Π). Π Π The common value for this supremum and infimum is called the integral of f over the cube Q, and denoted by f = f(x) dx. Q 7 This means that each brick from Π is a sub-brick of a brick from Π. Q 5

51 By R(Q) we denote the class of all Riemann integrable functions on Q. As in the one-dimensional case, we get Claim 8.3. The function f is integrable iff for each ɛ > exists the partition Π of Q such that U(f, Π) L(f, Π) < ɛ. Now, we are able to state explicitly which functions are Riemann-integrable Lebesgue Theorem The result is formulated in terms of the size of the discontinuity set of the function f. This set should not be too large. The quantitative way to say that the function f is discontinuous at the point x is to measure its oscillation at x: ( ) osc f (x) = lim sup f(y) lim inf f(y) y x y x =: lim δ sup f(y) lim y x <δ δ Exercise 8.4. The function f is continuous at x iff osc f (x) =. That is, the discontinuity set of f is B f = {x: osc f (x) > }. inf f(y) y x <δ Exercise 8.5 (properties of the discontinuity set). Prove that the sets B f+g, B f g and B max(f,g) are contained in B f B g. Prove that B f B f. Theorem 8.6 (Lebesgue). The bounded function f : Q R 1 is Riemannintegrable if and only if B f is a null set. We start with Claim 8.7. Let f : Q R 1 be a bounded function on a closed brick Q. Then the set B f,ɛ = {x Q: osc f (x) ɛ} is compact. Proof: We ll show that the set R n \ B f,ɛ is open. A minute reflection shows that it suffices to check that if x Q and osc f (x) < ɛ, then there exists a neighbourhood of x, where still osc f < ɛ. Choose δ > such that sup f(y) inf f(y) < ɛ. y x <δ y x <δ 8 Probably, the next result is new for you also in the one-dimensional case.. 51

52 Then, for any z from the ball z x < δ 2, osc f (z) sup f(y) z y <δ/2 inf f(y) z y <δ/2 sup f(y) y x <δ inf f(y) < ɛ, y x <δ and we are done. Proof of the theorem: First, assume that B f is null-set. We need to build a partition Π of Q such that U(f, Π) L(f, Π) is as small as we wish. We choose an arbitrarily small ɛ > and consider the compact set B f,ɛ. Since this is a null-set, B f,ɛ can be covered by finitely many open bricks {S j } with the sum of volumes less than ɛ. Let Π be a partition of Q. By R we denote the closed bricks from this partition. If some R intersects the set B f,ɛ, we choose the brick R so small that it contains in the corresponding brick S 9 j. Then we have an alternative: each closed brick R from this partition is either a sub-brick of one of the bricks S j (in this case, we say that R (I)), or is disjoint with the set B f,ɛ (then R (II)). Clearly, U(f, Π) L(f, Π) = osc f (R)v(R) = + osc f (R)v(R). R Π R (I) R (I I) We estimate separately these two sums. The first sum is small since the total volume of the bricks S j is small: osc f (R)v(R) 2 f v(s j ) < 2 f ɛ, R (I) here f = sup Q f. If R (II), then osc f (x) < ɛ, x R. Splitting, if needed, the closed brick R onto finitely many smaller closed bricks, we assume that osc f (R) = sup R f inf R f < ɛ. (Why this is possible?) Now, we see that the second sum is also small: osc f (R)v(R) < ɛ v(r) ɛv(q). R (II) This proves the result in one direction. j R Π Question 8.8. How the proof uses compactness of the sets B f,ɛ? 9 Otherwise, we just refine it. 52

53 Now, we prove the converse. Assume that f is Riemann-integrable on Q. Since B f = j B f,1/j, it suffices to show that each B f,1/j is a null-set. Given ɛ >, choose a partition Π of Q such that U(f, Π) L(f, Π) < ɛ/j. Let {R } be those of the bricks from this partition that intersect the set B f,1/j. Then, automatically, osc f (R ) 1/j for each brick R. Summing over these bricks, we get R R v(r ) j osc f (R )v(r ) j R Π osc f (R)v(R) < j ɛ j = ɛ. That is, B f,1/j R is a null-set, and B f is a null-set as well. We are done. To better understand this proof, I recommend to check it with all details in the one-dimensional case. 8.3 Properties of the Riemann integral 8.9. The constant function is integrable, and c = cv(q). Q 8.1. If the functions f and g are integrable, then any linear combination is integrable, and (αf + βg) = α f + β g. Q If f is an integrable function, then m f (Q)v(Q) f M f (Q)v(Q). Q Hence if f is a non-negative integrable function, then f. If f and g Q are integrable functions and f g, then f g. Q Q If the function f is integrable, then f is also integrable, and f f. Q If f is an integrable function, and f =, except of a null-set, then f =. Q 53 Q Q Q

54 Hint: for any partition Π of Q, L(f, Π) (since in each cube S of Π there is at least one point where f = ), while U(f, Π) (for a similar reason). Corollary If two integrable functions f 1, f 2 coincide a.e. 1, then their integrals coincide: f 1 = f 2. Q 8.4 Integrals over bounded sets with negligible boundaries Now, we extend the definition of the Riemann integral to any bounded sets E with a negligible boundary E. For any set E R n, we denote by { 1, x E, 1l E (x) =, x / E its indicator-function. 11 This function is continuous on the interiour and exteriour of E and is discontinuous on the boundary E. Let Q be a brick in R n, E Q, f : Q R 1 be a bounded function on Q. Definition E f := Q Q f 1l E. Of course, to make this definition the correct one needs to check several things. We d like to know that the function f 1l E is integrable. Claim The function 1l E is integrable iff the boundary E is a null-set. Claim If the functions f and g are integrable, then their product f g is integrable as well. The both claims are obvious corollaries to the Lebesgue theorem (the second one uses that B f g B f B g ). From now on, writing f, we always assume E that the boundary E is a null-set. 1 i.e. everywhere, except of a null set, a.e. = almost everywhere 11 It is also called the characteristic function of E. 54

55 We also would like to know that the value of the integral f E does not depend on the choice of the cube Q E and on the values of f on Q \ E. Exercise Check this! Now, we define the class of Riemann-integrable functions R(E) on bounded subsets E R n with negligible boundary E N. The bounded function f : E R 1 is Riemann- integrable on E, if the function f 1l E is integrable on some cube Q E. This definition does not depend on the choice of the brick Q. Exercise If E is a compact null-set, then any bounded function on E is integrable, and f =. Next, for any bounded set E, int(e) E f = E f = Exercise Let f be a non-negative Riemann integrable function on a cube Q. If f =, then f = a.e.. Hint: if x is a continuity point of f, then f(x) =. Q All the properties of Riemann integrals given in the previous section persist for integrals over bounded sets E with E N. (Check!) 8.5 Jordan volume Definition 8.2. The bounded set E R n is called Jordan-measurable (E J ) if E is a null-set. In this case, the volume (or content) of E is v(e) = 1l. 55 E Ē f.

56 This definition is rather restricted. For example, there are open sets and even domains which are not Jordan-measurable. Exercise Give example of a domain and a compact set which are not Jordan-masurable. You will learn in the courses Functions of real variables and Measure theory, how Lebesgue fixed that problem. There is another drawback: our definition, at least formally, depends on the choice of coordinates (since we started with the bricks). This will be fixed quite soon. Exercise Check that if v(e) =, then E is a null-set. The converse statement is wrong. It follows from Exercise 8.18 that if the set E is Jordan-measurable, then the sets E and inte are also Jordan-measurable and v(int(e)) = v(e) = v(ē). Exercise A bounded set E is Jordan measurable, iff given ɛ > there exists a partition Π of a cube Q E such that v(s) v(s) < ɛ, S Π + S Π where Π + = {S Π: S E }, Π = {S Π: S E}. That is, the Jordan sets are exactly those sets whose volume can be well approximated from inside and from outside by volumes of elementary sets, that is by volumes of finite unions of bricks: 12 v(e) = sup S Π v(s) = inf S Π + v(s). Note that if E 1 and E 2 are the Jordan sets, then the sets E 1 E 2 and E 1 E 2 are Jordan as well, and v(e 1 E 2 ) + v(e 1 E 2 ) = v(e 1 ) + v(e 2 ). 12 In the Lebesgue theory, all open sets and all compact sets are measurable. If the set Ω is open, then v(ω) def = sup{v(s): S Ω}, where S is a finite union of bricks, and if the set K is compact, then where Ω is open. v(k) def = inf{v(ω): Ω K}, 56

57 This follows from the identity 1l E1 E 2 + 1l E1 E 2 = 1l E1 + 1l E2, and linearity of the Riemann integral. Thus, the Jordan volume is a finitely additive set-function: if the Jordan sets E 1,..., E N are disjoint (or, at least, have disjoint interiours), then ( N ) v E j = j=1 N v(e j ). j=1 Another property of the Jordan volume is its invariance with respect to the translations of R n : if E J, then, for any c R n, E + c J (why?) and v(e + c) = v(e). Note (this is very important!), that the Jordan volume is determined (up to a positive multiplicative constant) by these two properties: Theorem 8.24 (uniqueness of the Jordan volume). The n-dimensional Jordan volume is the only function v : J R 1 + satisfying the following conditions: (i) finite additivity; (ii) translation invariance; (iii) normalization v(q) = 1, where Q = [, 1] n is the unite cube in R n. Proof: of this remarkable fact is fairly easy. Let v be another function with the properties (i) (iii). Then on dyadic cubes with the side length 2 m (m Z) the function v takes the same value (property (ii)), and hence coincides with v (property (i)). Thereby, v coincides with v on all bricks in R n (since any brick in R n can be approximated by a finite union of dyadic cubes), and hence on the whole J. The theorem we proved yields another very important fact: invariance of the volume with respect to orthogonal transformations. Theorem Let O L(R n, R n ) be an orthogonal transformation. Then for any Jordan set E, the set OE is Jordan as well, and v(oe) = v(e). Proof: First, note that if E is a Jordan set, and O is an orthogonal transformation of R n, then OE is also a Jordan set. Indeed, the transformation x Ox maps the boundary to the boundary (since this is an open map) and preserves the class of null-sets (since this map is a Lipschitz one). Then we consider the function v O : J R + acting as follows: v O (E) = v(oe). This is a finitely-additive and translation invariant function. Hence, 57

58 by the previous theorem, there is a non-negative constant c such that, for any E J, v O (E) = cv(e). It remains to check that c = 1. To this end, consider the unit ball B R n. Clearly, this is a Jordan set since its boundary (the unit sphere) is a union of two graphs of continuous functions. Obviously, OB = B, whence cv(b) = v O (B) = v(b). Since B is not a null-set, this yields c = 1. We finish this lecture with another useful and simple theorem: Theorem 8.26 (MeanValueTheorem). If G is a domain with negligible boundary G N, and f C(G, R 1 ) and bounded, then ξ G such that f = f(ξ)v(g). Indeed, we know that G inf G f v(g) G f sup f v(g). G Since the function f is continuous, for any c (inf U f, sup U f) there exists ξ U such that f(ξ) = c. In particular, this holds for c = f. G 58

59 9 Fubini Theorem In this part, we learn how to reduce multiple integrals to iterated onedimensional integrals. Start with heuristics. Let f be a continuous function on a rectangle Q = [a, b] [c, d]. Then (9.1) Q f(x, y) dxdy = b a { d c } f(x, y) dy dx = d c { b a } f(x, y) dx dy. The integrals on the RHS are called the iterated integrals. The idea is simple: consider the Riemann sums with the special choice of points ξ ij = (x i, y j ), ξ i,j Q i,j = A i B j, Then f(x i, y j ) A i B j = ( ) A i f(x i, y j ) B j = ( ) B j f(x i, y j ) A i i,j i j j In the limit, we get (9.1). 9.1 The statement Let A R n, B R m be bricks, f R(A B), that is the multiple integral (9.2) f(x, y) dxdy exists. A B Theorem 9.3 (Fubini). The iterated integrals ( ) ( ) dx f(x, y) dy, dy f(x, y) dx A A exists and equal the multiple integral (9.2). B The following example shows that one needs some vigilance applying Fubini s theorem: Example { 1 } x y (x + y) dy dx = 12 1 { 1 }, but x y 3 (x + y) dx dy = B i. 59

60 Let Q be a brick, and g a bounded function on Q. Introduce the upper and lower Darboux integrals g = inf U(g, Π), Q Π Q g = sup L(g, Π). Of course, g g, and g is integrable iff the upper and the lower integrals Q Q coincide and equal the regular one. Now, we define the function F (x) = f(x, y) dy. B If for some x A the integral f(x, y) dy does not exist, then we define B F (x) to be any number between f(x, y) dy and f(x, y) dy. In the course B B of the proof, we shall see that that { } x A: f(x, y) f(x, y) dy B B is a null-set, so, in fact, it is not important how F (x) was defined on that set. 9.2 Proof of Fubini s theorem Choose partitions Π A of A and Π B of B, and denote the corresponding partition of A B by Π = Π A Π B. If S is a brick from Π, then S = A i B j (A j A, B j B are bricks), and v n+m (S) = v n (A i ) v m (B j ). Π 6

61 Now, we have L(f, Π) = i,j = i i i i i i i,j ( ) inf f v n (A i )v m (B j ) A i B j ( ) inf f(x, y)v m (B j ) v n (A i ) x A i, y B j j ( ) inf f(x, y)v m (B j ) v n (A i ) y B j inf x A i inf x A i sup x A i ( j j B f(x, y) dy ) v n (A i ) m F (A i )v n (A i ) M F (A i )v n (A i ) i ( ) sup f(x, y) dy v n (A i ) x A i B ( ) sup f(x, y)v m (B j ) v n (A i ) y B j sup f(x, y)v m (B j )v n (A i ) = U(f, Π). A i B j Thus L(f, Π) i m F (A i )v n (A i ) i M F (A i )v n (A i ) U(f, Π). Hence, F R(A), and A F dx = A B f(x, y) dxdy. We are done. 9.3 Remarks 9.5. If f R(A B), then the sets { } x A: f(x, y) dy doesn t exist B 61

62 and { y B : are the null-sets. Proof: Consider the function x A } f(x, y) dx doesn t exist f(x, y) dy B B f(x, y) dy. These function is integrable (as the difference of two integrable functions), non-negative, and has zero integral over A. Hence, the function vanishes everywhere on A, except of a null-set Suppose f(x, y) = ϕ(x) ψ(y), where ϕ R(A), ψ R(B). Then f R(A B), and f = ϕ ψ. A B Proof: The integrability of f follows from the Lebesgue theorem, the rest follows from the Fubini theorem. as A In many cases, the domain D of multiple integration can be represented D = {(x, y) R n+1 : x E, f 1 (x) y f 2 (x)}, where E R n is a Jordan set, and f 1, f 2 are continuous functions on E, f 1 f In these assumptions, the set D is Jordan, and for any continuous function g on D, ( ) f2 (x) g = dx g(x, y) dy. D E f 1 (x) Proof: To check that D is Jordan, we look at its boundary D. It is the union of three sets: D = {(x, y): x E, f 1 (x) y f 2 (x)} {(x, f 1 (x)): x E} {(x, f2 (x)): x E}. B All three sets are null-sets, thus D is the null-sets as well. 62

63 We take a closed interval I f 1 (E) f 2 (E). Then E I D. Set h = g1l D, and apply Fubini s theorem: ( ) g = h = dx h(x, y) dy D E I E I ( ) f2 (x) = dx g(x, y) dy. E f 1 (x) Exercise 9.8. Compute the integral (x x x 2 3) dx 1 dx 2 dx 3, T where T is the simplex in R 3 bounded by the planes {x 1 + x 2 + x 3 = a}, {x i = }, 1 i 3. Answer: a 5 /2. Exercise 9.9. Find the volume of (i) the intersection of two solid cylinders in R 3 : {x 2 1 +x 2 2 1} and {x 2 1 +x 2 3 1}; Answer: 16/3. (ii) the solid in R 3 under paraboloid {x x 2 2 x 3 = } and above the square [, 1] 2. Answer: 2/3. Exercise 9.1. Find the integrals (x x 2 n) dx 1... dx n, [,1] n [,1] n (x x n ) 2 dx 1... dx n. Exercise Let f : R 1 R 1 be a continuous function. Prove that x x1 dx 1 dx 2... xn 1 Example Compute the integral dx n f(x n ) = x [,1] n max(x 1,..., x n ) dx 1...dx n. (x t)n 1 f(t) dt. (n 1)! First of all, by symmetry, we assume that 1 x 1 x 2... x n, and multiply the answer by n!. Then max(x 1,..., x n ) = x 1, and we get n! 1 x1 x 1 dx 1 dx 2... xn 1 dx n = n! 63 1 x n 1 dx 1 (n 1)! = n n + 1.

64 Exercise Compute the integral Answer: 1 n+1. [,1] n min(x 1,..., x n ) dx 1...dx n. 9.4 The Cavalieri principle Intuitively clear, that we can recover the volume of three-dimensional body, integrating the areas of its two-dimensional slices. The next theorem gives the precise statement. Let S R n be a closed brick, I R 1 closed interval, Q = S I, and let E Q be a Jordan set. We denote its n-dimensional sections by E(y) = {x S : (x, y) E}. 64

65 Theorem For a.e. y I, the n-dimensional slice E(y) is a Jordan set, and v n+1 (E) = v n (E(y)) dy. I Proof: Note that 1l E (x, y) = 1l E(y) (x). By Remark 9.5, we know that for a.e. y I there exists the integral 1l E (x, y) dx = 1l E(y) (x) dx =: v n (E(y)). S Again, by Fubini, v n+1 (E) = 1l E (x, y) dxdy = S I S I ( ) dy 1l E(y) (x) dx = v n (E(y)) dy. S I Done! Corollary 9.15 (Cavalieri). Let P, Q R 3 be Jordan sets, and P (c) := {(x, y, z) P : z = c}, Q(c) := {(x, y, z) Q: z = c}, be their two-dimensional horizontal slices. If, for almost every c R 1, area(p (c)) = area(q(c)), then v(p ) = v(q). Example 9.16 (volume of the unit ball in R 3 ). Inspecting the figure we see 65

66 that vol(b) = 1 1 area(d( 1 x 2 ) dx). Here D ρ is the disc of radius ρ. The latter integral equals 1 ( π (1 x 2 ) dx = π 2 2 ) = 4π Of course, volume of the ball of radius r in R 3 equals 4πr 3 /3. Exercise 9.17 (Archimedes). The volumes of an inscribed cone, the halfball, and a circumscribed cylinder with the same base plane and radius, are in the ratios 1 : 2 : 3. Exercise Show that the volume of the ball of radius r in R n equals v n r n. Compute the constant v 4. In the same way one can compute the volume of the unit ball in R n. The answer differs in the cases of even and odd dimensions: (2π) k v 2k+1 = v 2k+1 (B) = 2 (2k + 1)!!, v 2k = v 2k (B) = (2π)k (2k)!!. We skip the computation since later we ll find a simpler way to do this. Example 9.19 (volume of truncated cone). Let G R 2 be a bounded open set with negligible boundary. Consider the truncated cone with apex (,, t) and height h t: {(( C = 1 s ) ) } x, s : x G, s h. t 66

67 Then, by Cavalieri s principle, h ( vol(c) = area(g) 1 s ) 2 ds t ) = area(g) (h h2 t + h3 = area(g) h (1 3t 2 Exercise 9.2. Find the volume of the n-dimensional simplex Answer: 1 n!. T = {x: x 1,..., x n, x x n 1}. ) ht h2 +. 3t 2 Exercise Suppose the function f depends only on the first coordinate. Then 1 f(x 1 ) dx = v n 1 f(x 1 )(1 x 2 1) (n 1)/2 dx 1, B 1 where B is the unit ball in R n, and v n 1 is the volume of the unit ball in R n 1. The next two exercises deal with a very interesting phenomenon of concentration of high-dimensional volume. Exercise Let B r be a ball of radius r in R n. Compute v n (B r \ B.99r ) v n (B r ) 67

68 for n = 3, n = 1, and n = Given ɛ >, the quotient decays as e ɛn when n. v n (B r \ B (1 ɛ)r ) v n (B r ) Exercise Let B be the unit ball in R n, and P = {x B: x n <.1}. What is larger v n (P ) or v n (B \ P ) if n is sufficiently large? 2. Given ɛ >, show that the quotient tends to zero as n. Hint: the quotient equals v n ({x B: x n > ɛ}) v n (B) 1 ɛ (1 t2 ) (n 1)/2 dt 1 (1 t2 ) (n 1)/2 dt 3*. Find the asymptotic behaviour of that quotient as n. 68

69 1 Change of variables 1.1 The theorem Recall the change of variables in one-dimensional Riemann integrals: suppose ϕ: [a, b] R 1 is a C 1 -injection and f : [ϕ(a), ϕ(b)] R 1 is a continuous function. Then (1.1) ϕ(b) f = b ϕ(a) a (f ϕ)ϕ. Before writing the n-dimensional counterpart of this formula, let us observe a subtle difference in definition of the Riemann integral you ve learnt in Hedva- 2 and the new one. In Hedva-2, the integral β was defined as the limit of Riemann sums g(ξj )(α j+1 α j ), α g where α = α < α 1 <... < α N = β is a partition of the interval I = [α, β]. Our new definition starts with the Riemann sums 13 g(ξj ) I j = g(ξ j ) α j+1 α j. Therefore, the change of the orientation I I changes the sign of the old integral, and does not affect the new one 14. Now it is not difficult to guess that in our situation the counterpart of (1.1) looks as follows: f = (f ϕ) ϕ. ϕ(i) I Definition 1.2. A (C 1 -) diffeomorphism is a bijection T such that both T and T 1 are C 1 -mapping. Definition 1.3. The determinant J T = det D T is called the Jacobian of the mapping T. of the linear operator D T 13 We have not mentioned Riemann sums at all, and worked with the Darboux sums, but let us discard such a detail. 14 Later, we ll return to the idea of oriented integration 69

70 Theorem 1.4. Let U R n be a bounded domain, T : U T U a diffeomorphism, f bounded continuous function on T U. Then for any Jordan set Ω, Ω U, f = (f T ) J T. T Ω Ω Since the proof of this theorem is not short, we have not tried to optimize the assumptions. Here is the plan we will follow: 1. First, we will show that for any linear transformation L L(R n, R n ) and any Jordan set Ω R n, v(lω) = det L v(ω). 2. On the next step, we prove the infinitesimal version of the theorem which says that v(t Q) lim Q x v(q) = J T (x). 3. On the last step, we introduce the additive set-functions and complete the proof of the theorem. Exercise 1.5. If the set Ω U is Jordan, Ω U, and T : U R n is a diffeomorphism, then the set T (Ω) is Jordan as well. 1.2 v(lω) = det L v(ω) First of all, note that it suffices to prove this only for the standard unit cube Q in R n. Indeed, define a set-function v L (Ω) = v(lω). It is finitely additive and translation invariant. Thus, to show that v L = det L v, it suffices to check this on the unit cube Q. We give two proofs of the identity v(lq) = det L v(q) Polar decomposition of non-singular operators The 1-st proof is based on the polar decomposition of non-singular (i.e., invertible) linear operators, you ve probably learnt in Linear Algebra. First, assume that the operator L is singular (i.e., is not invertible). Then the image LQ lies in a proper linear subspace of R n, hence v(lq) =. Since det(l) =, there is nothing to prove in this case. Now, assume that L is not singular. We use the following Claim 1.6 (Linear Algebra). Any nonsingular linear transformation L L(R n, R n ) is a product of a self-adjoint one and the orthogonal one. 7

71 Question: why this decomposition 15 is called the polar one? Now, we are ready to prove the identity v(lq) = det L v(q) in the case when the operator L is non-singular. In view of the the polar decomposition, it suffices to check it only for orthogonal and positive operators. For orthogonal operators U, we already know that v(uq) = v(q), and since the absolute value of the determinant of U is one, again, there is nothing to prove. If L is a positive operator, then, without lost of generality, we assume that it is already diagonal (otherwise, we just change the orthogonal basis in R n, we know that the volume does not depend on the choice of the basis. Then LQ = [, λ 1 ]... [, λ n ], where λ i > are the eigenvalues of L (recall that all of them are positive, since L is positive). Thereby, n v(lq) = λ i = det(l)v(q). i= Volume of the parallelepiped in R n. Gram determinants The second proof is based on the computation of the volume of an arbitrary non-degenerate parallelepiped in R n. Let u 1,..., u m be vectors in R n. They generate the parallelepiped m P = P (u 1,... u m ) = {x = t j u j : t j [, 1]}. It is a subset of the linear subspace E m spanned by the vectors u 1,..., u m. Here, we shall compute the m-dimensional volume v m (P (u 1,..., u m )) based on the following rule: if P (u 1,... u m 1 ) is the base of P (u 1,..., u m ), then j=1 v m (P (u 1,... u m ) = y v m 1 (P (u 1,... u m 1 )), where y is the orthogonal projection of the vector u m on the one-dimensional subspace E m E m 1 (that is the height of P (u 1,..., u m ): Thus, we need to find the length of the vector y. In fact, you ve learnt how to do this in the Linear Algebra course. Claim 1.7. y 2 = Γ(u 1,... u m ) Γ(u 1,... u m 1 ), 15 For the sake of completeness, we recall the proof: Consider the operator LL. This is a positive operator: for any x R n \ {}, LL x, x = Lx 2 >. Hence, there exists a (unique) positive square root P = LL. Set U = P 1 L, then UU = P 1 LL P 1 = P 1 P 2 P 1 = I. That is, the operator U is orthogonal, and L = P U is the decomposition we were looking for. 71

72 where u 1, u 1 u 1, u 2... u 1, u m u 2, u 1 u 2, u 2... u 2, u m Γ(u 1,... u m ) = u m, u 1 u m, u 2... u m, u m is the Gram determinant. Note that Γ(u 1,..., u m ) = det (U U), where u u m1 U =..... u 1n... u mn is the matrix whose j-th column consists of the coordinates of the vector u j in the chosen orthonormal basis {e i } in R n. In other words, U is the matrix of the linear operator L such that u i = Le i. Corollary 1.8. v m (P (u 1,... u m )) = Γ(u 1,... u m ). 72

73 Indeed, by the claim, v 2 m (P (u 1,... u m )) = Γ(u 1,... u m ) Γ(u 1,... u m 1 ) v2 m 1 (P (u 1,... u m 1 )) Proof of the Claim: Let Then = = Γ(u 1,... u m ) Γ(u 1, u 1 ) v 2 1 (P (u 1 )) = Γ(u 1,... u m ) u 1 2 u 1 2 = Γ(u 1,... u m ). u m = u i, u m = m 1 i=1 m 1 j=1 α j u j + y. α i u i, u j + u i, y. Thus the last column of the determinant is the linear combination of the first m 1 columns (with the coefficient α i ) and the column u 1, y. u m 1, u m =.. u m, y y 2 Putting this column on the m=th place instead of the original one, we get the claim. Note 1.9. The proof of the claim also gives the properties of the Gram matrix which we will use later: and Γ(u 1,... u m ), Γ(u 1,... u m ) = if and only if the vectors u 1,..., u m are linearly dependent. Note 1.1. The proof also reveals the geometric meaning of the claim: assume, we have the vectors u 1,..., u m and we want to approximate the vector v by the linear combination of these vectors. In other words, we are looking for m min α 1,...,α m v α j u j. 73 j=1

74 Then this minimum (i.e. the distance between v and the linear span of u 1,..., u m ) equals Γ(u 1,... u m, v). Γ(u 1,... u m ) Exercise m Γ(u 1,... u m ) Γ(u 1,... u k ) Γ(u k+1,... u m ), 1 k Hint: first, show that for 1 j k 1, Γ(u j, u j+1,... u m ) Γ(u j+1,... u m ) Γ(u j, u j+1,... u k ) Γ(u j+1,... u k ) 2. Γ(u 1,... u m ) u u m 2 ; that is the volume of the P (u 1,... u m ) never exceeds the volume of the brick with the length-sides u 1,..., u m. 3. When the equality sign attains in these inequalities? 4. (Hadamard s inequality) Let a 11 a a n1 A = a 1n a 2n... a nn Then A 2 n a 2 1k 1 n a 2 2k... 1 n a 2 nk. In particular, if a ij 1 for all i and j, we get A n n/2. Return to the volume of LQ, Q the unit cube in R n, L L(R n, R n ). Let e 1,..., e n be the orthonormal basis in R n, u j = Le j. Then u j, u k = Le j, Le k = L Le j, e k, that is, Γ(u 1,..., u m ) = det(l L), and v 2 (LQ) = (det(l)) 2. Thus we got the second proof that v(lq) = det(l). 1.3 The infinitesimal version Let T be a C 1 diffeomorphism of an open set U. Intuitively, if a cube Q U is small, then the mapping T on Q is lose to its linear part, i.e. to T (x) + D T (x)(x x ), x Q, and v(t Q) det D T (x) v(q). Recall that the determinant det D T (x) is called the Jacobian of T at x. We use notation Q x which means that cubes Q contain the point x and diam(q). 74 1

75 Theorem Let T be a C 1 -diffeomorphism of an open set U. Then, for each x U, v(t Q) lim Q x v(q) = J T (x). Proof: The idea is straightforward. Given ɛ >, choose δ > so small that if the diameter of Q is = δ, and y Q, then (1.13) T (y) T (x) D T (x)(y x) < ɛ y x ɛδ If y fills the whole cube Q, then the vector T (x) + D T (x)(y x) fills a parallelepiped P of volume J T (x) v(q). Next, by (1.13), P ɛδ T (Q) P +ɛδ. Here we use the following notations: A +η = {x: dist(x, A) η} is the η- neighbourhood of the set A, and A η = {x: dist(x, R n \ A) η}. It remains to estimate the volumes of the thin layer P +ɛδ \ P ɛδ. The volume of this layer is majorized by the sum of the volumes of the ɛδ-neighbourhoods of the faces of P. The volume of such a neighbourhood is bounded from above by 2ɛ δ times the n 1-dimensional volume of the n 1-dimensional 75

76 ɛδ-neigbourhood of the face. The length-sides of the parallelepiped P are bounded by C(D T )δ, thus this n 1-dimensional ɛδ-neigbourhood is contained in an n 1-dimensional parallelepiped with the length-sides bounded by C(D T, n)δ, and hence the volume of this neighbourhood is C(D T, n)δ n 1. We get the estimate for the volume of the thin layer: or Thus C(D T, n)ɛδ δ n 1 = C(D T, n)ɛ δ n. v(t Q) v(p ) C(D T, n)ɛ δ n, v(t Q) J T (x) v(q) C(D T, n)ɛ δ n. Recall that the diameter of Q is = δ, that is, v(q) c(n)δ n. We get v(t Q) v(q) J T (x) C(D T, n)ɛ. This completes the proof. 1.4 The additive set-functions We denote by J the collection of all Jordan-measurable subsets of R n ; by J (U) we denote the collection of all Jordan-measurable subsets of the open set U. Definition The function µ: J (U) R 1 is called an additive setfunction if it satisfies the following conditions: additivity: µ(ω 1 Ω 2 ) = µ(ω 1 ) + µ(ω 2 ), Ω 1, Ω 2 J (U), Ω 1 Ω 2 = ; continuity from below: if Ω j Ω, then µ(ω j ) µ(ω); differentiability with respect to the cubes: there exists the derivative µ (x) = lim Q x µ(q) v(q). 76

77 Example: µ f (Ω) = where f is a bounded continuous function on U. The additivity is obvious. The continuity from below follows from the boundedness of f: f f Ω Ω j f 1l = f (v(ω) v(ω j ) ). Ω\Ω j To see that the RHS tends to zero we use the following Claim Let Ω j Ω be an exhaustion of Ω by Jordan subsets. Then Ω f, lim v(ω j) = v(ω). j Proof of the claim: Given ɛ >, choose a finite union of open cubes U Ω such that v(u) < ɛ. If j is large enough, then Ω \ Ω j U (why?). Then v(ω) v(ω j ) = v(ω \ Ω j ) v(u) < ɛ. The differentiability of the set-function µ f follows by the mean value theorem: for each cube Q there exists ξ Q such that µ f (Q) = f(ξ)v(q). Apparently, this example is generic : Theorem Let µ be an additive set-function, such that µ (x) is bounded and continuous. Then, for any Ω J (U), µ(ω) = µ (x) dx. Ω Proof: it suffices to prove the theorem in the case µ (x) (then we can apply this special case to the set-function µ µ f, f = µ ). WLOG, we assume that Ω is a cube, having the result for the cubes, we easily get the general case by approximating the Jordan sets from below by finite unions of the cubes. We need to show that µ(q) = for any cube Q. Let the result be wrong; i.e. there exists the cube Q such that µ(q ), then, for some λ >, µ(q ) λv(q ). Then, using the bisections, we get a nested sequence of cubes Q j such that the length sides of Q j are twice smaller than those of Q j 1, and µ(q j ) λv(q j ). Clearly, Q j x, and µ (x). Contradiction. 77

78 1.5 Proof of the change of variables theorem: Assume that Ω U and set µ(ω) def = f. T (Ω) This is the additive set-function. Indeed, the additivity is obvious. Also, the continuity from below: µ(ω) µ(ω j ) = f, T (Ω)\T (Ω j ) the sets T (Ω j ) are Jordan (why?) and they exhaust the whole set T (Ω). As above, the differentiability follows by the mean value theorem: for any cube Q, there exists ξ Q such that thus µ(q) v(q) µ(q) = f(t (ξ) )v(t Q), = f(t (ξ) ) v(t Q) v(q) f(t (x) ) J T (x) when Q x. Thus, µ (x) = f(x) J T (x), and we apply the previous theorem. 1.6 Examples and exercises Exercise 1.17 (spherical coordinates in R 3 ). Consider the map F : R 3 R 3, F (r, ϕ, θ) = r(cos ϕ sin θ, sin ϕ sin θ, cos θ). (i) Find and draw the images of the planes and of the lines r = const, ϕ = const, θ = const, (ϕ, θ) = const, (r, θ) = const, (r, ϕ) = const. (ii) Prove that F is surjective but not injective. (iii) Show that J F = r 2 sin θ. Find the points (r, ϕ, θ), where F is regular. (iv) Let V = R + ( π, π) (, π). Prove that F V is injective. Find U = F (V ). (v) Compute (find the formula) the inverse map F 1 on U. 78

79 Exercise Compute the integral [,1] 2 cos 2 (π(x + y)) dxdy. Exercise Compute the integrals 1. dxdydz x 2 + y 2 + (z 2), 2 2. x 2 +y 2 +z 2 1 x 2 +y 2 +z 2 1 Answers: π ( log 3), π ( log 3). dxdydz x 2 + y 2 + (z 1/2) 2. Exercise 1.2. Compute the integral dxdy (1 + x 2 + y 2 ) 2 taken (i) over one loop of the lemniscate (x 2 + y 2 ) 2 = (x 2 y 2 ); (ii) over the triangle with vertices at (, ), (2, ), (1, 3). Hint: use polar coordinates. Exercise Compute the integral over the four-dimensional unit ball: e x2 +y 2 u 2 v 2 dxdydudv. x 2 +y 2 +u 2 +v 2 1 Hint: The integral equals ( e x2 +y 2 x 2 +y 2 1 Then use the polar coordinates. u 2 +v 2 1 (x 2 +y 2 ) Volume of the ellipsoid in R n ) e (u2 +v 2) dudv dxdy. We start with a special case, and find the volume of the ellipsoid { } n x 2 j x: 1. j=1 a 2 j 79

80 Changing the variables x j = a j y j, we easily find that the volume of this ellipsoid equals a 1 a n v(b), where B is the unit ball in R n (we already know its volume). { } x In particular, if n = 2, we get that the area of the ellipse + y2 1 a 2 b { 2 } x is abπ, and if n = 3, then the volume of the ellipsoid a 2 + y2 b 2 + z2 1 c 2 is 4π 3 Now, consider a general n-dimensional ellipsoid in R n given by {x: Ax, x 1}, where A is a non-negative linear transformation. We make an orthogonal transformation x = Oy which reduces the matrix of A to the diagonal form. This transformation does not change the volume of the ellipsoid, and applying the special case considered above, we get the answer: v(b) λ1 (A)... λ n (A), where λ 1 (A),..., λ n (A) are the eigenvalues of A. Exercise Compute the integral xyz dxdydz taken over the ellipsoid {x 2 /a 2 + y 2 /b 2 + z 2 /c 2 1}. Answer: a2 b 2 c 2 6. Exercise Find the volume cut off from the unit ball by the plane lx + my + nz = p Volume of the body of revolution in R 3 Take a plane domain Ω that lies in the right half-plane of the (ρ, z)-plane, and consider a body of revolution obtained by rotation of the domain Ω in R 3 around the z-axis; i.e. Ω = {(ρ cos θ, ρ sin θ, z): (ρ, z) Ω, θ 2π}. Introduce the cylindric coordinates: x = ρ cos θ, y = ρ sin θ, z = z. Then dxdydz = ρ dρdθdz, and 2π v( Ω) = dθ ρ dρdz = 2π ρ dρdz. Ω Ω 8

81 Often, Ω is defined as {(ρ, z): a < z < b, < ρ < ρ(z)}. Then and we get Ω ρ dρdz = b a dz v( Ω) = π ρ(z) b a ρ dρ = 1 2 ρ 2 (z) dz. b Center of masses and Pappus theorem a ρ 2 (z) dz, Suppose we have a system of N material points (P i, m i ), 1 i N, in R 2, P i = (X i, Y i ) are the points, and m i are the masses. The center of mass of this system is located at the point P = mi P i mi. 81

82 If we have a continuous distribution p of masses in a plane domain Ω, then the total mass of Ω is m(ω) = p, and the coordinates of the center of masses are xp(x, y) dxdy Ω X =, Y = m(ω) Ω Ω yp(x, y) dxdy m(ω) The integrals in the numerator are called the 1-st order moments of the distribution p. The same formulas hold in R 3 (and, more generally, in R n ). Exercise Show that 1. the mass of the ball of radius r centered at the origin with density distribution p(x, y, z) = x 2 y 2 z 2 is M = 4πr9 945, 2. the mass of the ellipsoid {x 2 /a 2 + y 2 /b 2 + z 2 /c 2 1} with density distribution p(x, y, z) = x 2 + y 2 + z 2 is. M = 4πabc 15 ( a 2 + b 2 + c 2). The distribution of masses is homogenenous if p is a constant function. Sometimes, the center of masses of a homogeneous distribution is called centroid. In this case, we always the use normalization p 1 which, of course, does not affect the position of the centroid. Exercise (a) Prove that the centroid of the triangle lies in the intersection of its medians. (b) Suppose that Ω is a finite union of triangles i, and P i are the centroids of i. Prove that the centroid P of Ω coincides with the centroid of the system of points P i with masses m i = area( i ). Of course, we can define the center of masses and centroid also for 3- (or n-) dimensional bodies. Exercise Find the centroids of the following bodies in R 3 : 1. The cone built over the unit disc, the height of the cone is h. 82

83 2. The tetrahedron bounded by the three coordinate planes and the plane x a + y b + z c = The hemispherical shell {a 2 x 2 + y 2 + z 2 b 2, z }. 4. The octant of the ellipsoid {x 2 /a 2 + y 2 /b 2 + z 2 /c 2 1, x, y, z }. We conclude with a beautiful ancient Theorem 1.27 (Pappus). The volume of the body of revolution Ω obtained by rotation of the plane domain Ω equals the area of Ω times the length of the circle described by the centroid of Ω. Proof: As we know, v( Ω) = 2π ρ dρdz = 2π Ω Ω ρ dρdz area(ω) area(ω). 83

84 Observe that the centroid of Ω describes the circle of the radius Ω R = ρ dρdz. area(ω) Example Consider the solid torus T in R 3 obtained by rotation of the disc {(ρ, z): (ρ c) 2 + z 2 r} around the z-axis. Its volume is v(t ) = 2πc πr 2 = 2π 2 cr 2. 84

85 11 Improper Integrals The definition of the Riemann integral we gave above has several drawbacks. For instance, it does not allow us to integrate unbounded functions, or consider unbounded domains of integrations. In this lecture, we fix this Definition Let Ω m Ω be an exhaustion of an open set Ω by Jordan sets Ω m. We already know that 1. if Ω J, then v(ω m ) v(ω); 2. if Ω J and f R(Ω), then f Ω m We want to accept the second property as the definition of the integral Ω f in the cases when the function f is unbounded, or the domain Ω is unbounded 16 The problem is that the different exhaustions can give the different answers, but we look for the definition of the integral which does not depend on the exhaustion. For example, consider Then but lim n Ω 1 + x 1 + x 2 dx. n 1 + x lim dx = lim n n 1 + x arctan x n n n 2 log(1 + x2 ) n = π, n 2n n 1 + x dx = lim 1 + x arctan x 2n 2 n n The good news is the following f log(1 + x2 ) 2n n = π + log 2. Claim If f, and Ω m Ω, Ω k Ω are two exhaustions of an open set Ω, then the following limits are equal: lim f = lim f. m Ω k m Ω k 16 One can use the same approach in the case when the domain Ω is not Jordan. 85

86 Proof: Let Ω m f A, Ω k f B, A, B [, + ]. Then Ω k f (ii) = lim f lim m + Ω k Ω m m f = A. Ω m Hence, B A. By symmetry, A B. Done.. Example 11.2 (Poisson). Consider the integral R 2 e (x2 +y2) dxdy. First, let us exhaust the plane by the discs Ω m = {x 2 + y 2 m 2 }. In this case, Ω m e (x2 +y2) dxdy = 2π dθ m e r2 r dr = π(1 e m2 ) π. Now, consider the exhaustion by the squares Ω k = { x, y k}. We get k k ( 2 e (x2 +y 2) dxdy = e x2 dx e y2 dy e dx) x2. Ω k k Juxtaposing the answers, we obtain the celebrated Poisson formula: k e x2 dx = π. The corresponding n-dimensional integral equals: (11.3) Rn e (Ax,x) dx = πn/2 det A, here A is a positive linear transformation. First, observe that ( n (11.4) e x 2 dx = e dt) t2 = π n/2. R n 86

87 Also observe that (11.5) e at2 dt = π a. If the matrix of A is diagonal, then (11.3) follows from (11.4) and (11.5). If the matrix is not diagonal, we make the orthogonal transformation x Ox which reduced the matrix of A to the diagonal one. Example Let f(x) = x α, α >. We consider separately two cases: Ω 1 = B, and Ω 2 = R n \ B. First, suppose that we integrate over the unit ball B. We split the ball into the layers C n = 2 k x 2 1 k, k 1. If x C k, then integrand is between 2 α(k 1) and 2 αk. Also, c 1 2 kn vol(c k ) c 2 2 kn (the constants c 1 and c 2 depend on the dimension n only). Hence the integral B dx x α converges and diverges simulteneously with the series k 1 2(α n)k. We see that the integral converges if α < n, and diverges otherwise. In the second case, we use a similar decomposition into the layers {2 k x 2 k+1 } and obtain the series k 1 2(n α)k. Hence, the second integral converges iff α > n. In particular, the both integrals never converge simulteneously. One more Example dxdy = (1 x 2 y 2 ) α x 2 +y 2 <1 2π = 2π 1 2 dθ 1 1 rdr (1 r 2 ) α ds (1 s) α = π Of course, the computation has a sense only if α < 1. 1 dt t α = π 1 α. Definition If for any exhaustion Ω m Ω, f R(Ω m ), there exists the limit lim f, m Ω m that does not depend on the choice of the exhaustion, then we say that the integral f converges and equals to this limit. Ω 87

88 Now we give a very useful majorant sufficient condition for convergence: Claim Suppose that f g on Ω and that the integral g converges. Ω Also suppose that for any exhaustion Ω m Ω by Jordan sets, f R(Ω m ). Then the integrals f and f converge. Ω Ω Proof: Since f R(Ω m ), f R(Ω m ) (the Lebesgue criterium). Fix ɛ >, and choose large m, k, m > k. Then f Ω m f = Ω k f Ω m \Ω k g = Ω m \Ω k g Ω m g < ɛ, Ω k { } if k is sufficiently large. Thus the sequence of integrals converges, Ωm and the integral f exists. Ω Now, set f + = max(f, ), f = ( f) + = max( f, ), and observe that f = f + f, where f, f + f. Hence, there exist the integrals f ±, and therefore there exists the integral f = f + The proof is complete. Exercise Compute the integral dx x, where Q = [, 1] 2 is the unit square in R 2. Exercise Compute 1. Ω Ω Ω Q Ω f. R 2 ax + by e (x2 +y2 )/2 dxdy, 2. x, a p e x 2 dx, a R n, p > 1. R n 88

89 Hints: 1. Rotating the plane, introduce new coordinates (x, y ) such that x = ax+by a. 2 +b 2 2. The general case is reduced to a = (,...,, a ). Exercise Prove that dξ x ξ 2 y ξ = c 2 x y. R 3 Try to find the positive constant c. Exercise For which values of p and q the integral dxdy x p + y. q converges? x + y 1 Exercise Find the sign of the integral log(x 2 + y 2 ) dxdy. max( x, y ) 1 Exercise Verify if the integrals dxdy 1 + x 1 y 1 converge or diverge? R Useful inequalities R 2 e (x+y)4 dxdy Here are the integral versions of the classical inequalities of Cauchy-Schwarz, Hölder and Minkowski. By L p (R n ) we denote the class of Riemann-integrable functions in R n such that f p def = ( ) 1/p f <. R n Exercise (Cauchy-Schwarz, Hölder). 1. Suppose f, g L 2 (R n ). Then f g R n 2 f 2 R n g 2. R n 89

90 2. Suppose f L p (R n ), g L q (R n ), = 1. Then p q ( ) 1/p ( ) 1/q f g f p g q. R n R n R n Hint: use the inequality ab ap p + bq q, a, b R +. Exercise (Minkowski). If f, g L p (R n ), 1 p <, then f + g p f p + g p, 1 p <. Hint: start with a + b p a a + b p 1 + b a + b p 1, then use Hölder s inequality The Newton potential The gravitational force F exerted by the particle of mass µ at point ξ on a particle of mass m at point x is F = γmµ µ (x ξ) = γm x ξ 3 x ξ, γ is the gravitational constant. This is the celebrated Newton law of gravitation. The function U : x µ is called the Newton (or gravitational) x ξ potential. The reason to replace the force F by the potential U is simple: it is easier to work with scalar functions than with the vector ones. If the force F is known, then one can write down the differential equations of motion of the particle (Newton s second law) mẍ = F, or ẍ = γ µ x ξ. Then one hopes to integrate these equations and to find out where is the particle at time t. What happens if we have a system of point masses µ 1,..., µ N at points ξ 1,..., ξ N? The forces are to be added, and the corresponding potential is U(x) = N j=1 µ j x ξ j. Now, suppose that the gravitational masses are distributed with continuous density µ(ξ) over a portion Ω of the space. Then the Newton potential is defined as µ(ξ) dξ U(x) = ξ x Ω 9

91 (the integral is a triple one, of course), and the corresponding gravitational force (after normalization γ = 1, m = 1) is again F = U. Let us compute the Newton potential of the homogeneous mass distribution (i.e., µ(ξ) 1 within the ball B R of radius R centered at the origin: U(x) = B R dξ x ξ. By symmetry, U is a radial function, that is it depends only on x. Exercise Check this! Thus, it suffices to compute U at the point x = (,, z), z. We use the spherical coordinates: ξ 1 = r sin θ cos ϕ, ξ 2 = r sin θ sin ϕ, ξ 3 = r cos θ. Then U = R π dr 2π r 2 sin θ dθ R (z r cos θ)2 + r 2 sin 2 θ = π dr 2π r 2 sin θ dθ z2 2zr cos θ + r 2 } {{ } V The underbraced expression V is the Newton potential of the homogeneous sphere of radius r. We compute V using the variable t 2 = z 2 2zr cos θ + r 2. Then z r < t < z + r, and t dt = zr sin θ dθ. We get V = 2πr 2 z+r z r t dt zr t = 2πr ( ) r 2 ( z + r z r ) = 4π min z z, r. Now, we easily find U by integration: U = R If x is an external point, i.e., z R, then V dr. R r 2 4πR3 U = 4π dr =. z 3z. If x is located inside the ball, i.e., z < R, then ( z r 2 R ) ( ) z 2 U = 4π z dr + r dr = 4π z 3 + R2 2 z2 = 2π 2 3 ( 3R 2 z 2). 91

92 Thus, Done! U(x) = 4πR 3 x if x R, 2π 3 (3R2 x 2 ) if x < R. Observe that 4πR 3 /3 is exactly the total mass of the ball B R. That is, together with Newton, we arrived at the conclusion that the gravitational potential, and hence the gravitational force exerted by the homogeneous ball on a particle is the same as if the whole mass of the ball were all concentrated at its center, if the point is outside the ball. Of course, you heard about this already in the high-school. Another important conclusion is that the potential V of the homogeneous sphere does not depend on the point x when x is inside the sphere! Hence, the gravitational force is zero inside the sphere. The same is true for the homogeneous shell {ξ : a < ξ < b}: there is no gravitational force inside the shell. Exercise Check that all the conclusions are true when the mass distribution µ(ξ) is radial: µ(ξ) = µ(ξ ) if ξ = ξ. I.e., compute the mass of the ball B R, the potential of the ball B R, and the potential of the shell {x: R 1 < x < R 2 }. Exercise Find the potential of the homogeneous solid ellipsoid (x 2 + y 2 )/b 2 + z 2 /c 2 1 at its center. 2. Find the potential of the homogeneous solid cone of height h and radius of the base r at its vertex. Problem Show that at sufficiently large distances the potential of a solid S is approximated by the potential of a point with the same total mass located at the center of mass of S with an error less than a constant divided by the square of the distance. The potential itself decays as the distance, so the approximation is good The Euler Gamma-function Definition Γ(s) = t s 1 e t dt, s >. 17 This estimate is rather straightforward. A more accurate argument shows that the error is of order constant divided by the cube of the distance. 92

93 We know that Γ(s + 1) = sγ(s), Γ(n) = (n 1)! (start with Γ(1) = 1), and Γ( 1) = t 1/2 e t dt = 2 e x2 dx = π. 2 Exercise Find the limits lim s sγ(s) and lim s Γ(αs) Γ(s). There are two remarkable properties of the Γ-function which we d like to mention here without proof. The first one is the identity Γ(s)Γ(1 s) = π sin πs that extends the Γ-function to the negative non-integer values of s. second one is the celebrated Stirling s asymptotic formula Γ(s) = 2πs s 1/2 e s e θ(s), < θ < 1 12s The Gamma-function is very useful in computation of integrals. Claim x α 1 (1 x) β 1 dx = Γ(α)Γ(β) Γ(α + β), α, β >. The left hand side is called the Beta-function, and denoted by by B(α, β). Proof: Γ(α) Γ(β) = We introduce the new variables (u, v): { t 1 = u(1 v) t 2 = uv. t α 1 1 t β 1 2 e (t 1+t 2 ) dt 1 dt 2. The This is a one-to-one mapping of the 1-st quadrant {t 1, t 2 > } onto the semistrip {u >, < v < 1}. This can be seen, for example, from the formulas { t 1 + t 2 = u t 1 /t 2 = 1 1. v The Jacobian equals 1 v v u u = u uv + uv = u. 93

94 We obtain Γ(α) Γ(β) = = u du 1 u α+β 1 e u du dv u α 1+β 1 (1 v) α 1 v β 1 e u 1 (1 v) α 1 v β 1 dv = Γ(α + β)b(α, β). There are many definite integrals that can be expressed via the Gammafunction. Example Consider the integral We rewrite it in the form π/2 and change the variable: π/2 sin α 1 θ cos β 1 θ dθ. ( sin 2 θ ) α/2 1 ( cos 2 θ ) β/2 1 sin θ cos θ dθ, sin 2 θ = x, dx = 2 sin θ cos θ dθ. We get ( 1 α 2 B 2, β ) = 1 Γ ( ) ( α 2 Γ β ) Γ ( ). α+β 2 A special case of this formula says that π/2 Exercise Check that π/2 sin α 1 θ dθ = cos α 1 θ dθ = Γ ( ) ( α 2 Γ 1 ) 2 2 Γ ( ) = α Deduce the duplication formula: B(x, x) = 2 1 2x B(x, 1 2 ). π 2 Γ ( ) α 2 Γ ( ). α+1 2 Γ(2x) = 22x 1 π Γ(x) Γ(x ). 94

95 Exercise Show that 1 1 π/2 x 4 1 x 2 dx = π 32, x m e xn dx = 1 ( ) m + 1 n Γ, n x m (log x) n dx = ( 1)n n!, n N, (m + 1) n+1 dx = Γ2 (1/4) cos x 2 2π. We mention without proof another very useful formula x p x dx = π sin πp, < p < 1. There is a simple proof that that uses the residues theorem from the complex analysis course. This formula yields that Γ(s)Γ(1 s) = π (note that sin πs 1 tp 1 (1 t) p dt = x p 1 dx). 1+x The Dirichlet formula We start with the Dirichlet formula: x 1,...,x n, x x n 1 x p x p n 1 n dx 1... dx n = Γ(p 1 )... Γ(p n ) Γ(p p n + 1), p 1,... p n >. Proof: we use induction with respect to the dimension n. For n = 1 the formula is obvious: 1 x p dx 1 = 1 p 1 = Γ(p 1) Γ(p 1 + 1). Now, denote the n-dimensional integral by I n, and assume that the result is valid for n 1. Then I n = 1 x p n 1 n dx n n 1 { }} {... x 1,... x n 1 x x n 1 1 x n 95 x p x p n 1 1 n 1 dx 1... dx n 1.

96 To compute the inner integral, we introduce the new variables x 1 = (1 x n )ξ 1,..., x n 1 = (1 x n )ξ n 1. Then the inner integral equals (1 x n ) n 1+(p 1 1)+... +(p n 1 1)... ξ p ξ p n 1 1 n 1 dξ 1... dξ n 1 Thus, 1 I n = I n 1 (1 x n ) p p n 1 x p n 1 dx n = ξ 1,... ξ n 1 ξ ξ n 1 1 = (1 x n ) p p n 1 I n 1. Γ(p 1 )... Γ(p n 1 ) Γ(p p n 1 + 1) Γ(p p n 1 + 1)Γ(p n ) Γ(p p n + 1) There is a seemingly more general formula:... x 1,...,x n, x γ xγ n n 1 x p x p n 1 n dx 1... dx n = n 1 γ 1... γ n Γ = Γ(p 1 )... Γ(p n ) Γ(p p n + 1). ( ) p 1 γ 1... Γ ( ) p n γ n Γ ( p 1 γ p n γ n + 1 It is easily obtained from the previous one by the change of variables y j = x γ j j. There is a special case which is worth mentioning: p 1 =... = p n = 1, γ 1 =... = γ n = p:... dx x 1,...,x 1... dx n = n x p xp n 1 p n Γ ( ) Γ n 1 p ( n p + 1 ). We ve found the volume of the unit ball in the metric l p : ( ) 2 n Γ n 1 p v n (B p (1)) = ( ). p n n Γ + 1 p Of course, if p = 2, the formula gives us the volume of the standard unit ball: v n = v n (B) = 2πn/2 nγ ( ). n 2 We also see that the volume of the unit ball in the L 1 -metric equals 2n n!. Question: what the formula gives us in the p limit? 96 ).

97 Exercise x x n 1 x 1,...,x n = Γ(p 1)...Γ(p n ) Γ(p p n ) ϕ(x x n )x p x p n 1 n ϕ(u)u p 1+...p n 1 du. dx 1...dx n 97

98 12 Smooth surfaces We start with smooth (two-dimensional) surfaces in R 3 and their tangent planes. Then we define and briefly discuss smooth k-dimensional surfaces in R n, k n. The cases k = 1 and k = n 1 correspond to the lines and hyper-surfaces in R n, the extreme cases k = and k = n correspond to points and domains in R n. In the case k = 2, n = 3, we get (two-dimensional) surfaces in R 3. We finish with discussion of normal vectors to hyper-surfaces Surfaces in R 3 There are three definitions of smooth surfaces in R Graph of function The surface M R 3 is a graph of function z = f(x, y). The class of smoothness of the surface M is defined according to the class of smoothness of f Zero set of a smooth function The surface M R 3 can be defined as the zero set of a smooth function: M = {(x, y, z): F (x, y, z) = }. Locally, this definition is equivalent to the previous one. Obviously, graph of the function z = f(x, y) can be viewed as the zero set of the function z f(x, y) =. To move in the opposite direction, we say that he point P (x, y, z ) M is called regular, if F (P ) =, but F (P ). Wlog, suppose F z (P ). Then, by Implicit Function Theorem, we can solve equation F (x, y, z) = near the point P, i.e., we find the function z = f(x, y) such that z = f(x, y ), and F (x, y, f(x, y)) in a neighbourhood of (x, y ) Parametric surfaces are defined as the image in R 3 of a nice domain G R 2 under a C 1 -injection r : G R 3. The variables (u, v) are called the local parameters on the surface M. Let, in the coordinates, x(u, v) r(u, v) = y(u, v). z(u, v) 98

99 The point P (r(u, v )) M is called regular if the matrix of the derivative D r has rank two, i.e. ) rank (u, v ) = 2. ( x u x v y u y v z u z v Claim If P M is a regular point, then in a neighbourhood of P the surface M is graph of a function. Proof: Suppose, for instance, that x u x v y u y v, i.e. Jacobian of the mapping (12.2) { x = x(u, v) y = y(u, v) does not vanish at (u, v ). Then by the Inverse Function Theorem the mapping (12.2) can be inverted in a neighbourhood of (u, v ): { u = u(x, y) u = u(x, y ), v = v(x, y) v = v(x, y ). Substituting this into expression for z, we get z = f(x, y) def = z (u(x, y), v(x, y)), z = f(x, y ) = z(u, v ). Thus, near regular points all three definitions of surfaces coincide! The tangent plane Let M be a parametric surface, G domain of the local coordinates (u, v), the point r(u, v ) be regular. Consider a curve γ(t) = (u(t), v(t)) G passing through the point (u, v ) and its image, the curve r(t) = r(u(t), v(t)) M passing through the point P : The tangent ( = velocity) vector of the curve r(t) is ṙ(t) = r u u + r v v. Since (u, v ) is a regular point, the vectors r u = (x u, y u, z u ) and r v = (x v, y v, z v ) are linearly independent. Thus any tangent vector to M 99

100 at P is a linear combination of r u and r v (with coefficients u and v), and the tangent plane T P M is a two dimensional linear space spanned by the vectors r u and r v. If M was defined as the zero set of a function F, then equation of a curve on M is F (x(t), y(t), z(t)) =. Differentiation by t, we get F x ẋ + F y ẏ + F z ż =. If F, that is, the point is regular, we get equation for the coordinates (ξ, η, ζ) of the tangent vector: F x (x, y, z )ξ + F y (x, y, z )η + F z (x, y, z )ζ =. If we want to think about tangent vectors as of vectors that start at the point (x, y, z ) M, then we get the affine plane in R 3, its equation is F x (x, y, z )(ξ x ) + F y (x, y, z )(η y ) + F z (x, y, z )(ζ z ) =. 1

101 Exercise Let Σ be the ellipsoid x 2 a 2 + y2 b 2 + z2 c 2 = 1. in R 3. Find the distance p(x, y, z) from the origin to the tangent plane of Σ at (x, y, z) Examples of surfaces in R 3 Ellipsoid in R 3 is defined by equation x 2 a 2 + y2 b 2 + z2 c 2 = 1, that is, this is the zero set of the quadratic polynomial F (x, y, z) = x2 + y2 + a 2 b 2 z 2 1. All points of the ellipsoid are regular. Globally, it cannot be defined c 2 as a graph of a function or by only one coordinate map (why?). The ellipsoid is parameterized by the local coordinates x = a cos ϕ cos θ, y = b sin ϕ cos θ, z = c sin θ, where ϕ 2π, θ π. One sheet hyperboloid is defined by equation x 2 a 2 + y2 b 2 z2 c 2 = 1. All points of this surface are regular. It is defined parametrically as x = a 1 + z2 cos ϕ, y = b 1 + z2 sin ϕ, z = z, where ϕ 2π and < c 2 c 2 z <. Double-sheet hyperboloid (or, elliptic paraboloid) is defined by equation x2 a y2 2 b + z2 2 c = 1. 2 All points of this surface are regular. Each sheet is the graph of the function z = ±c 1 x2 y2. a 2 b 2 Cone C is defined by equation x 2 a 2 + y2 b 2 z2 c 2 =. It has a singular point (,, ). C \ {} is a smooth surface. 11

102 Cylinder is defined by equation x2 + y2 = 1. All points of the cylinder are a 2 b 2 regular. It s parametric description is x = a cos ϕ, y = b sin ϕ, z = z, where ϕ 2π, < z <. Exercise Build a smooth one-to-one map from the punctured plane {(x, y): < x 2 + y 2 < } onto the cylinder. Surface of revolution Given a curve γ : I R 3 lying in the half-plane {x >, y = } we can built the surface of revolution Σ R 3 revolving γ around the z-axis. If γ(s) = (γ 1 (s),, γ 3 (s)), then Σ is defined by the mapping r(s, t) = (γ 1 (s) cos t, γ 1 (s) sin t, γ 3 (s)). The most popular example of surface of revolution is 12

103 Torus obtained by rotation of the circle of radius a centered at (b,, ), a < b. The parametric equations of the torus are x = (b + a cos s) cos t, y = (b + a cos s) sin t, and z = a sin s. Helix and helicoid The helix (or, spiral) is a curve in R 3 defined by t (cos t, sin t, t), < t <. The helicoid is a surface in R 3 defined by (s, t) (s cos t, s sin t, t), s >, < t <. Exercise Draw the pictures of helix and helicoid Equivalent definitions of k-surfaces in R n We give three equivalent definitions of k-surfaces: as graphs, as zero sets, and as images of open sets. The equivalence will again follow from the Implicit and Inverse Function Theorems. 13

104 Graphs of functions A subset M R n (M ) is a smooth k-dimensional surface if for any x M there is a neighbourhood U of x such that M U is the graph of a smooth mapping f of an open subset W R k into R n k : M U = { (w, f(w): w W R k}. Here, we are free to choose which n k coordinates in R n are functions of the other k coordinates. For simplicity, we usually assume that the first k coordinates are free and the last n k coordinates are the functions of them. Observe that the mapping r : W R n, r(w) = (w, f(w)) is a bijection between W and r(w ) = M U. The inverse mapping r 1 : M U W is the projection (w, f(w)) w. According to the class of smoothness of f, we define the class of smoothness of the surface M. Examples: 1. Any open set X R n is a C -surface. We take W = X, f : X R = {}, that is f(x) = for all x X, and identify x X with (x, f(x)). 2. Any point x R n is also a C -surface. Why? Zero sets Let U R n be an open set, F : U R p smooth function, p n, F (U). Consider the zero set Z F def = {x U : F (x) = }. The point z Z f is regular if rankd F (z) = p, i.e., is maximal 18. If z is a regular point of Z F, then there exists a neighbourhood U of z such that Z F U is a smooth k-dimensional surface, k = n p. Indeed, since rankd F (z) = p, we can choose p coordinates, say, x n p+1,..., x n, such that the derivative of F with respect to these variables is invertible. Then we apply the Implicit Function Theorem and conclude that locally the coordinates x n p+1,..., x n are the functions of x 1,..., x n p. Example The unit sphere S n 1 R n is the zero set of the function F (x) = x 2 1. Since F = 2x on S n 1, all points of the sphere are regular. 18 Otherwise, z is a singular point of Z F. 14

105 Parametric surfaces Suppose r : V R n is a smooth mapping, V R k, k n, is an open set. The mapping r is regular if rankd r (v) = k at every point v V. Exercise If rankd r (v ) = k, then rankd r (v) = k in a neighbourhood of v. (Of course, this is just a special case of the Exercise 4.9.) We say that M R n is a smooth k-dimensional surface if for any x M there exist a neighbourhood U of x, an open set V R k, and a regular bijection r : V M U. Formally, this definition is more general than the first one, that uses graphs. Let us check that they are actually coincide. Fix a point a = r(b). Since D r has maximal rank k, we can choose k coordinate functions, say r 1 (v),..., r k (v), such that the corresponding mapping r =. of V into R k has the invertible derivative D r (v). Then, applying the Inverse Function Theorem, we get a neighbourhood W of the point a = r 1 r k a 1. a k, and the inverse mapping h = (r ) 1 : W V. Substituting this mapping into the functions r k+1 (v),..., r n (v), we get the mapping f = r k+1. r n h: W R n such that locally, in a neighbourhood of the point a, M is the graph of f. (Why?) The coordinates v 1,..., v k are called the local (curvilinear) coordinates on M U, the function r is called the coordinate patch, or the map. The maps can overlap : suppose that there are two maps r 1 and r 2 such that r 1 (V 1 ) r 2 (V 2 ). Then we can define the function r2 1 r 1. Exercise The composition map r2 1 r 1 is a C 1 -diffeomorphism between the open sets r1 1 (r 1 (V 1 ) r 2 (V 2 )) and r2 1 (r 1 (V 1 ) r 2 (V 2 )). Question What is the minimal number of coordinate maps needed to define the unit sphere S n 1 R n? Justify your answer. 15

106 12.3 The tangent space Definition Suppose γ : I R n is a C 1 -curve in R n. Then the vector γ 1 (t) γ(t) =. γ n (t) is called the tangent (or velocity) vector of γ at the point x = γ(t). Definition Suppose M is a C 1 -smooth k-surface. Then the vector v R n is called a tangent vector to M at the point x M if there exists a C 1 -curve γ : I M and t I such that γ(t) = x, and γ(t) = v. The set of all tangent vectors to M at x is called the tangent space of M at x and denoted by T x M. Geometrically, we think about elements of the tangent space T x M as of vectors that start at the point x. To better understand this definition, assume first that M is an open set in R n, and x M. Then a minute reflection shows that T x M = T x R n R n. We know that the derivative of a smooth mapping f : M R m at the point x M is a linear operator D f (x) L(R n, R m ). Now, we understand that it acts on the tangent spaces: D f (x): T x M T f(x) f(m). Indeed, if γ : I M is a curve in M passing through the point x = γ(t), then f γ : I R m is a curve in R m passing through the point f(x), and by the chain rule d(f γ) = D f (x) γ. dt Now, let us return to tangent spaces of k-surfaces. Since we have three equivalent definitions of k-surfaces, we get three different ways to find the tangent space. First, we assume that M is defined parametrically. Fix x M. Then we can find a neighbourhood U of x, an open set V R k and C 1 -bijection r : V M U, x = r(b). This establishes a one-to-one correspondence between C 1 -curves x(t) in M U passing through x, and C 1 -curves v(t) in V passing through b: x(t) = (r v)(t). Differentiating this relation, we get ẋ(t) = D r (v(t)) v(t) = k v i (t)r vi (b). i=1 In other words, D r (b) maps T b V bijectively onto T x M. In particular, T x M is a linear subspace of R n of dimension k generated by k vectors r v1 (b),..., r vk (b). 16

107 If M U is the zero set of the C 1 -mapping F : U R n k, we have F (x(t)) for any curve x(t) in M. Differentiating this by t, we get D F (x)ẋ(t) =, that is ẋ(t) kerd F (x), or T x M kerd F (x). Since the both are linear subspaces of R n of dimension k, we get T x M = kerd F (x). For instance, if M is a hypersurface (i.e., k = n 1, and F is a scalar-valued function, then the tangent space T x M is the orthogonal complement to the gradient F (x) Normal vectors to hypersurfaces Suppose M R n is a hypersurface, then dimt x M = n 1 and dim(t x M) = 1. Definition The unit normal N(x) to M at x is a vector from (T x M) such that N(x) = 1. Clearly, N(x) is defined up to its sign 19. Sometimes, the vector N(x) is written in the form (cos θ 1,..., cos θ n ). The angles θ 1,..., θ n [, π] are called the directional cosines. Now, let us look how to find the normal vector. First, suppose that M is the zero set of a smooth function F. Then, as we already know, the gradient of F is orthogonal to any tangent vector, that is F (x) N(x) = ± F (x). If M is a graph of a scalar function f(x 1,..., x n 1 ), we define F (x) = x n f(x 1,..., x n 1 ). Then M is the zero set of F. Notice that F (x) = f x1. f xn 1 1, and F (x) = 1 + f(x) 2. The case when M is defined parametrically is the most involved. Suppose, locally, M is defined by equation x = r(v). The tangent hyperplane T x M is 19 Later we shall use this to orient the hypersurface M. 17

108 spanned by n 1 vectors r v1,..., r vn 1, and we need to learn how to compute the normal vector to this linear span. For this, we need to extend an idea of the vector product from the case of two vectors in R 3 to the case of n 1 vectors in R n. Cross product of n 1 vectors in R n Given n 1 linearly independent vectors R 1,..., R n 1 in R n consider the determinant det(r, R 1,..., R n 1 ) 2. This is a linear functional on R, thus there exists a vector Y that represents this functional: det(r, R 1,..., R n 1 ) = Y, R. The vector Y is called the cross product of the vectors R 1,..., R n 1, and is denoted by Y = R 1... R n 1. It follows from the definition of Y that Y, R i = det(r i, R 1,..., R n 1 ) =, 1 i n 1. Fix an orthonormal basis {E i } 1 i n in R n. The coordinates of Y in this basis are Y i = Y, E i = det(e i, R 1,..., R n 1 ), and Thus Y = n Y i E i = i=1 n det(e i, R 1,..., R n 1 )E i. i=1 E 1 E 2... E n R 1,1 R 1,2... R 1,n R 1... R n 1 = R n 1,1 R n 1,2... R n 1,n where R i,j = (R i, E j ) is the j-th coordinate of R i. Note that the first column of the determinant consists of vectors whilst the other columns consist of scalars. In particular, when n = 3, we arrive to the vector product formula E 1 E 2 E 3 A B = A 1 A 2 A 2 B 1 B 2 B 3. It remains to compute the norm of the cross-product. In fact, we already know how to do this: Claim (linear algebra). Y = Γ(R 1,..., R n 1 ), where Γ(R 1,..., R n 1 ) is the Gram determinant of the vectors R 1,..., R n 1. 2 That is the determinant of the matric which columns consist of these vectors. 18

109 Proof: We have Y 4 = Y, Y 2 = det 2 (Y, R 1,..., R n 1 ). The RHS is the square of the volume of the parallelepiped P (Y, R 1,..., R n 1 ) spanned by the vectors Y, R 1,..., R n 1. We know that it equals to Γ(Y, R 1,..., R n 1 ). Since Y is orthogonal to the rest of the vectors, We are done! Γ(Y, R 1,..., R n 1 ) = Y 2 Γ(R 1,..., R n 1 ). Returning to our problem, we get the useful formula for unit normal vectors N(x) to parametric surfaces: N(x) = ± r v 1... r vn 1 Γ(rv1,..., r vn 1). If n = 3 and r = r(u, v), then r u r v 2 = Γ(r u, r v ) = r u 2 r v 2 (r u, r v ) 2. In differential geometry there are special notations for the quantities that enter the right-hand side: E = r u 2, F = (r u, r v ), and G = r v 2. Then Γ(ru, r v ) = E G F Juxtaposing two computations If in a neighbourhood of the point ξ M the hyper-surface M is defined as the graph of the function: M U ξ = {(x, f(x)): x G R n 1 }, then x 1 x 2 r(x) =. x n 1 f(x 1,..., x n 1 ) and 1 r 1 =., r 2 = 1.,..., r n 1 = f x1 f x2. 1 f xn 1. 19

110 We know that the vectors r 1... r n 1 and one dimensional vector space (T ξ M), that is r 1... r n 1 = C f x1. f xn 1 1 f x1. f xn 1 1. belong to the same It is not difficult to check that C = ( 1) n 1. Indeed, comparing the n-th components of these two vectors, we see that C is the n-th component of the vector r 1... r n 1 ; i.e C = ( ) n = ( 1) n We arrive at the useful corollary: Corollary In the notations as above, Γ(r 1,..., r n 1 ) = 1 + f 2. 11

111 13 Surface area and surface integrals To avoid technicalities, we shall tacitly assume that M is an elementary surface defined by one patch M = r(u), U R k, is a good bounded domain (usually, U is a brick or a ball), r is a C 1 -bijection. In the practical computations, sometimes, we ll need to consider the surfaces that are not elementary, like the unit sphere in R n. In these cases, we just split the domain of integration into finitely many elementary parts, requiring additivity of the area and integrals Fundamental definitions We want to define the k-area on M and the integrals of good functions on M over the k-area. First, let us look at the tangent spaces T u R k and T x M, x = r(u) M. We know that D r : T u R k T x M is a linear bijection, and Γ(r 1,..., r k ) = det(dr(u)d r (u)) is the volume distortion coefficient under the mapping D r. (Here, r j = r u j, 1 j k, are k vectors that span T x M.) Thus, if Ω U is a sufficiently small cube centered at u, then we expect that the k-dimensional area of the distorted cube r(ω) M is Γ(r 1,..., r k ) vol k (Ω). Moreover, the smaller Ω is, the closer area k (r(ω)) vol k (Ω) is to Γ(r 1,..., r k ). Since the k-area on M is supposed to be set-additive, we naturally arrive to the following definition Definition 13.1 (k-dimensional area). A k (r(ω)) = Γ(r1,..., r k ) = Ω Ω det(d r D r ). Definition 13.2 (k-dimensional surface integral). Suppose f is a continuous function on M vanishing outside a compact subset of M. 22 Then f ds = (f r) Γ(r 1,..., r k ) = (f r) det(drd r ) M U 21 Formally, this case is treated by the partition of unity. We will come later to its construction. 22 Of course, we require too much. If M is parameterized by a finite brick or ball U, then it suffices to require that the function f r is continuous on U. U 111

112 These definitions do not depend on the choice of parameterization of M. Suppose ρ: V M is another C 1 -parameterization. Then ϕ = r 1 ρ: V U is a C 1 -diffeomorphism, and U Since (f r) det(drd r ) = (the chain rule), we get as expected. V V (f r) ϕ det(d }{{} r D r ) ϕ det(d ϕ ). }{{} =f ρ = det(dϕ) det(d ϕ) (D r ϕ) D ϕ = D ρ (f ρ) det(dρd ρ ), In the case k = n our definition coincides with our previous definitions of multiple integral and the volume. Now, we examine the cases k = 1 (length and integrals over the curves), k = 2 and n = 3 (surface area and surface integrals in R 3 ), and k = n 1 ( hyperarea in R n ). Exercise Suppose u = ϕ(v); i.e. the old local coordinates u 1,..., u k are C 1 -functions of the new local coordinates v 1,..., v k. Let ρ = r ϕ. Then Γ(ρ v1,..., ρ vk ) = Γ(r u1,..., r uk ) J 2 ϕ, where is the Jacobian of ϕ. J ϕ = det D ϕ = (u 1,..., u k ) (v 1,..., v k ) 13.2 Length and integrals over the curves Let γ : I R n be a C 1 -curve, Γ = γ(i). Then the length of Γ is L(Γ) = γ. If f : Γ R 1 is a good function, then f ds = (f γ)(t) γ(t) dt. Γ I I 112

113 This definition of length is consistent with the geometric one. Let for time being γ be a continuous curve, and let Π be a partition of the segment I = [a, b]: a = t < t 1 <... < t N = b. By Γ Π we denote the corresponding polygonal line inscribed in Γ. It consists of N segments [γ(t j ), γ(t j+1 )], j N 1. Its length equals L(Γ Π ) = N 1 j= Definition 13.4 (length of the curve). γ(t j+1 ) γ(t j ). L(Γ) = sup L(Γ Π ). Π If L(Γ) <, then the curve Γ is called rectifiable. Theorem 13.5 (equivalence of the length definitions). If Γ is a piece-wise C 1 -curve, then Γ is rectifiable, and L(Γ) = I γ. Proof: We split the proof into several simple claims. 1. If a partition Π is finer than Π, then L(Γ Π ) L(Γ Π ). It suffices to check what happens when we add one new point to the partition. In this case, the result follows from the triangle inequality. 2. Additivity of the length: Let c (a, b) be an inner point of I. Set γ 1 = γ [a,c], γ 2 = γ [c,b]. Then L(Γ) = L(Γ 1 ) + L(Γ 2 ). Indeed, let Π 1 be a partition of [a, c], and Π 2 be a partition of [c, b]. We denote by Π = (Π 1, Π 2 ) the corresponding partition of [a, b]. Then L(Γ 1,Π1 ) + L(Γ 2,Π2 ) = L(Γ Π ) L(Γ). In the opposite direction, let Π be any partition of [a, b]. We choose partitions Π 1 of [a, c] and Π 2 of [c, b] such that (Π 1, Π 2 ) is finer than Π. Then L(Γ Π ) L(Γ 1,Π1 ) + L(Γ 2,Π2 ) L(Γ 1 ) + L(Γ 2 ). Remark: the argument shows that Γ is rectifiable if and only if the both curves Γ 1 and Γ 2 are rectifiable. 3. If f : I R n is a continuous function, then f f. I This we already know (and used in the proof of the inverse function theorem). 113 I

114 4. If Γ is a C 1 -curve, then Indeed, for each partition Π, L(Γ Π ) = N 1 j= γ(t j+1 ) γ(t j ) = L(Γ) N 1 j= I γ. tj+1 t j N 1 γ j= tj+1 γ = t j I γ. 5. Due to additivity, it suffices to prove the result for C 1 -curves. Given t [a, b], we set l(t) = L(γ [a,t] ). Then l(t ) l(t ) = L(γ [t,t ] ), and γ(t ) γ(t ) l(t ) l(t ) t γ(t ) γ(t ) t t l(t ) l(t ) t t 1 t t t γ, t t γ. Letting t t, we see that the function l(t) is differentiable with respect to t, and l(t) = γ(t) (and hence l is continuous). Since l(a) =, we get l(t) = In particular, t l = t a a L(Γ) = l(b) = completing the proof of the theorem. b a γ. γ, Example If the curve Γ R 2 is the graph of the function y = f(x), a x b, then b L(Γ) = 1 + f 2 (x) dx. a 2. If the curve Γ R 2 is defined by polar coordinates r = r(θ), α θ β, then its length is β L(Γ) = r2 (θ) + r 2 (θ) dθ. α Exercise Compute the length of the cardioid r = 2(1 + cos θ), θ 2π. 114

115 Exercise Compute where Γ = {(x, y): x2 a 2 Answer: ab a 2 +ab+b 2. 3 a+b + y2 b 2 Γ xy ds, = 1, x, y } is the 1-st quarter of the ellipse. Exercise Suppose γ : I S 2 is a curve on the unit sphere: Show that γ 1 (t) = sin ϕ(t) cos θ(t), γ 2 (t) = sin ϕ(t) sin θ(t), γ 3 (t) = cos ϕ(t). L(γ) = I ϕ 2 + θ 2 sin 2 ϕ Exercise Find the coordinates of the center of masses of the homogeneous curves: (a) arc of the cycloid: x = a(t sin t), y = a(1 cos t), t 2π. (b) the boundary of the spherical triangle x 2 + y 2 + z 2 = a 2, x >, y >, z >. The gravitational force F induced by the curve Γ R 3 with the density distribution µ on a particle with the unit mass at the point x is ξ x F (x) = µ(ξ) dξ ξ x 3 (for simplicity we equal the gravitational constant γ to one). Γ Exercise Find the gravitational force exerted by the homogeneous infinite line in R 3 (µ 1) on the particle of unit mass at the distance h from the line. Hint: choose convenient coordinate system Surface area and surface integrals in R 3 Suppose M R 3 is a C 1 surface defined by the patch r : Ω R 3, Ω is either the square or the disc. Then, according to our general definition, A(M) = Γ(ru, r v ) dudv = r u r v dudv. Ω 115 Ω

116 If we wish to integrate the function f : M R 1 over M, we set f ds = (f r) r u r v dudv. M Ω If the surface M is defined as the graph of the function z = f(x, y), then r x r y = 1 + f 2. If the surface M is defined by the equation F (x, y, z) =, and z can be expressed as a function of x and y: z = f(x, y), then Examples 1 + f 2 = F F z. Example (area of the unit sphere in R 3 ). It is very easy to guess the answer: A(S 2 ) = d dr vol(b r) ( ) 4πr 3 r=1 = = 4π. 3 This computation can be justified. However, instead, we shall compute the area directly. The unit sphere is defined by equation F (x, y, z) =, where F (x, y, z) = x 2 + y 2 + z 2 1. We ll deal with the upper hemi-sphere. Then F z = 2z = 2 1 (x 2 + y 2 ), F 2 = (2x) 2 + (2y) 2 + (2z) 2 = 4, and A(S 2 ) = 2 = 2π x 2 +y 2 <1 1 2 dxdy 2 1 (x 2 + y 2 ) = 2 ds 1 s = 2π 1 dt t = 4π. r=1 2π Now, we extend a little bit the previous computation: 1 r dr dθ 1 r 2 Example (area of the spherical cap and the kissing number). Consider the spherical cap S ψ = {(cos ϕ sin θ, sin ϕ sin θ, cos θ): ϕ < 2π, < θ < ψ}. Then r(ϕ, θ) = (cos ϕ sin θ, sin ϕ sin θ, cos θ), r ϕ = ( sin ϕ sin θ, cos ϕ sin θ, ), r θ = (cos ϕ cos θ, sin ϕ cos θ, sin θ), and r ϕ r θ = sin θ. Thus ψ A(S ψ ) = 2π sin θ dθ = 2π(1 cos ψ). 116

117 Using this computation, we estimate the kissing number of the unit spheres in R 3, that is the maximal number of the unit spheres that touch a given unit sphere. Let us start with the plane case: If the circle S of radius one touches the circle S of radius one, that the angle it is seen from the center of S equals π 3. If N(2) unit circles touch the unit circle S, then the sum of the angles they are seen from the center of S cannot be larger than 2π. Hence, the number N(2) of the circles is 6. The next figure shows that this bound is sharp. Now consider the three-dimensional case. If the unit sphere S touches the unit sphere S, then S can be placed inside the cone with vertex at the 117

118 center of S and of angle π/3 and the area of the spherical cap on S located inside this cone equals 2π(1 cos π/6) = π(2 3). If N(3) unit spheres kiss the given unit sphere S, then the corresponding spherical caps are disjoint. Thus N(3)π(2 3) 4π, or N(3) 4(2 + 3) < 15. This estimate is not sharp. The sharp bound is N(3) = 12. It was known already to Newton, though the accurate proof was given only in the 2th Century. 23 Exercise Find S 2 x 2 i ds, 1 i 3, without ANY actual computation. Hint: the integrals do not depend on i. Exercise (area surface of the intersection of two cylinders). Compute A( K), where K = {x R 3 : x x 2 2 1, x x 3 3 1}. Hint: K = {x x 2 2 1, x x 2 3 = 1} {x x 2 3 1, x x 2 2 = 1}. Answer: 16. Exercise Find the area of the part of the sphere {x 2 + y 2 + z 2 = R 2 } located inside the cylinder {x 2 + y 2 = Rx}. Find the area of the part of the cylinder {x 2 + y 2 = Rx} located inside the sphere {x 2 + y 2 + z 2 = R 2 }. Example (Guldin s rule: area of the surface of revolution in R 3 ). Consider the surface of revolution Σ in R 3 obtained by rotation of the curve z = ϕ(ρ), α ρ β (α > ) around the z-axis. Parametric equations of the surface Σ are x = ρ cos θ, y = ρ sin θ, z = ϕ(ρ). Here θ < 2π, and α ρ β. Then 23 Try Google. r θ = ( ρ sin θ, ρ cos θ, ), r θ 2 = ρ 2, r ρ = (cos θ, sin θ, ϕ (ρ)), r ρ 2 = 1 + ϕ 2 (ρ), (r θ, r ρ ) =. 118

Hence, β A(Σ) = 2π ρ 1 + ϕ 2 (ρ) dρ. α Exercise 13.18. Find the area of the surface Σ obtained by rotation of the curve ρ = ρ(z), a z b around the z-axis.

119 Hence, β A(Σ) = 2π ρ 1 + ϕ 2 (ρ) dρ. α Exercise Find the area of the surface Σ obtained by rotation of the curve ρ = ρ(z), a z b around the z-axis. It is convenient to introduce the arc length, as the new parameter on the curve: ds = 1 + ϕ 2 (ρ) dρ, and denote by L = β α 1 + ϕ 2 (ρ) dρ the total length of the curve. Let ρ(s) be the distance from the z-axis to the point on the curve that cut the length s from the beginning of the curve: Then A(Σ) = 2π L ρ(s) ds. Example (area of the surface of the torus). The torus is obtained by rotation of the circle z 2 + (ρ a) 2 = b 2 (a b) around the z-axis. If θ is the 119

120 polar angle on that circle, then θ = s/b, and ρ(s) = a + b cos(s/b). Then A = 2π 2πb (a + b cos(s/b)) ds = 2πa 2πb. Remarkably enough, the surface area of the torus equals the product of the lengths of two circles that generate the torus. Exercise 13.2 (Archimedus). Area of the spherical belt of height h on the unit sphere {x R 3 : x x x 2 3 = 1, c < x 3 < c + h}, 1 c 1 h, equals 2πh (and thus does not depend on the position of the belt on the sphere!). Exercise Find the volume and the surface area of the solid obtained by rotation of the triangle ABC around the side AB. The length sides a = BC, b = AC, and the distance h from the vertex C to the side AB are given. 12

121 Exercise The density of the sphere of radius R is proportional to the distance to the vertical diameter. Find the mass of the sphere, and the center of masses of the upper hemi-sphere. Answer: the mass equals π 2 R 3, the coordinates of the center of masses are (,, 3 8 R). Exercise Find the centroid of the homogeneous conic surface z = x2 + y 2, < x 2 + y 2 < 1. Answer: (,, 2 3 ) Hyperarea in R n Some useful formulae If the hypersurface M R n is defined as graph of the function: M = {x: x n = ϕ(x 1,..., x n 1 ), (x 1,..., x n 1 ) G}, then, according to our compu- 121

122 tation, the Gram determinant Γ equals 1 + ϕ 2. Thus 1 A n 1 (M) = 1 + ϕ 2 = cos ψ, where ψ is the angle between the vectors. and 1 G ϕ x1. ϕ xn 1 1 G. Exercise Suppose Σ n 1 = {x R n : x i, i x i = 1} is a standard n 1-simplex in R n. Then n A n 1 (Σ n 1 ) = (n 1)!. Similarly, if the hypersurface M is defined by the equation Φ(x 1,..., x n ) =, Φ xn, then f ds = f Φ Φ xn dx 1...dx n 1. M G Here G is the projection of M onto the hyperplane x n =. Now, suppose that the domain V R n is covered by a family of hypersurfaces M c defined by Φ(x 1,..., x n ) = c, a < c < b, in such a way that through each point x V there passes one and only one hypersurface M c. Suppose that Φ in V. Then (13.25) V f(x) dx = b a dc M c f(x) Φ(x) ds. We also assume for simplicity that Φ xn everywhere in V (the general case can be reduced to this one). Then we replace the coordinates x 1,..., x n by new coordinates x 1,..., x n 1, c = Φ(x 1,..., x n ). The corresponding Jacobian is (x 1,..., x n 1, x n ) (x 1,..., x n 1, c) = 1 Φ xn 122

123 (it s faster to compute the Jacobian of the inverse mapping!). Thus, b f(x 1,..., x n ) f(x) dx = dc dx 1...dx n 1 V a M c Φ xn b f(x 1,..., x n ) Φ = dc a M c Φ Φ xn dx 1...dx n 1. }{{} =ds proving (13.25) Integration over spheres The formula (13.25) is very useful. We apply it to the case when M ρ = ρs n 1 is the sphere of radius ρ in R n. In this case Φ(x 1,..., x n ) = x x 2 n, Φ xi = x i ρ, Φ = 1. A minute thought shows that f(x) ds(x) = ρs n 1 f(ρy)ρ n 1 ds(y). S n 1 Exercise Check this! Thus (13.25) gives us (13.27) f(x) dx = rb r ρ n 1 dρ f(ρy) ds(y) S n 1 Differentiating by r, we arrive at d f(x) dx = r n 1 f(ry) ds(y). dr S n 1 rb In particular we find the relation between the volume v n of the unit ball in R n and the hyper-area ω n of the unit sphere in R n : or d dr (v nr n ) = r n 1 ω n, v n = ω n n Now, suppose that f is the radial function; i.e. f(x) = h( x ). Then formula (13.27) gives us r (13.28) h( x ) dx = ω n h(ρ)ρ n 1 dρ, < r. rb 123

124 n 1-area of the unit sphere in R n Making use the latter relation, we easily compute the area of the unit sphere and the volume of the unit ball in R n. First, we use (13.28) with h(ρ) = e ρ2 and r =. The left-hand side equals ( n dx = dt) = π n/2. R n e x 2 R 1 e t2 The integral on the right-hand side equals ρ n 1 e ρ2 dρ = 1 t (n/2 1 e t dt = 1 ( n ) 2 2 Γ. 2 Thus ω n = 2πn/2 Γ ( ). n 2 That is, ω 1 = 2 (explain the meaning!), ω 2 = 2π, ω 3 = 4π, ω 4 = 2π 2,... etc. From here, we once more find the volume v n of the unit ball: Exercise v n (B) = ω n n = 2(π)n/2 nγ ( ) = ( π) n n Γ ( n + 1). 2 2 R n dx (1 + x 2 ) = Γ ( ) p n p πn/2 2 Γ(p) In the next two exercises we use the following notations: R n + = {x R n : x i, 1 i n}, S n 1 + = {x S n 1 : x i, 1 i n}. Exercise Compute the integral S n 1 + y p y p n n ds(y). Hint: integrate the function f(x) = e x 2 x p x p n n over R n +. Check your answer in the special case p 1 =... = p n =. Exercise Suppose a R n +. Then S n 1 + Hint: integrate e a,x over R n +. ds(y) a, y n 1 = 1 (n 1)! a 1... a n. 124

125 Exercise (Poisson). Suppose f is a continuous function of one variable. Then 1 f( x, y ) ds(y) = ω n 1 f( x t)(1 t 2 ) n 3 2 dt. S n 1 1 Hint: due to the symmetry with respect to x, it suffices to consider the case x = (,..., x ). 125

126 14 The Divergence Theorem 14.1 Vector fields and their fluxes Definition 14.1 (Vector fields). Let U R n be an open set. The vector field F on U is the mapping Examples: point x U tangent vector F (x) T x (R n ). the gradient field F = f; the velocity field of a flow of fluid or gas: ẋ(t) = F (x(t)) ( stationary field ), or more generally, ẋ(t) = F (t, x(t)) ( non-stationary field ); the solution x(t) is called trajectory of the field; the force field (gravitational, Coulomb, magnetic) Definition 14.3 (the flux form). ω F (ξ 1,..., ξ n 1 ) = det(f (x), ξ 1,..., ξ n 1 ), ξ 1,..., ξ n 1 T x (R n ), that is, the oriented volume of the parallelepiped P (F (x), ξ 1,..., ξ n 1 ) generated by the vectors F (x), ξ 1,..., ξ n 1. The flux can be written as F (x), N vol n 1 P(ξ 1,..., ξ n 1 ), where N is the oriented unit normal vector to the parallelepiped P (ξ 1,..., ξ n 1 ): N = ξ 1... ξ n 1 Γ(ξ1,..., ξ n 1 ). If F is the velocity field of a flow of liquid in R 3, then the flux form equals the volume of the liquid that runs through the oriented parallelogram P (ξ 1, ξ 2 ) in the unit time. To define the flux of the vector field through the hyper-surface, we need to choose the unit normal N(x) at x M, that depends continuously on x 24. Sometimes, this is impossible, for instance, for the Möbius strip in R 3. Such surfaces are called non- orientable and we shall not consider them. From now on, we always assume that there exists a continuous normal vector field N(x) on M, it defines the orientation of M. 24 If this choice is possible, then there are exactly two choices of continuous normal field. Prove this! 126

127 Definition 14.4 (flux through the hyper-surface). Suppose M is a hypersurface in R n, N(x) is the unit normal to M at x that depends continuously on x M, F is a continuous vector field on M. The flux of F through M equals flux F (M) = F, N ds. If M = G is a boundary of a good open set G R n, then we always choose the unit outward normal N(x), that corresponds to the outward flux through M. How to decide which normal is the outward one? If G is defined as the sublevel set of a C 1 -function, i.e., G = {x: F (x) < c}, then the outward normal to M = G coincides with the normalized gradient: N = F. If F locally M = G is the graph of a C 1 -function, for instance, M U = {x U : x n = f(x 1,..., x n 1 )}, then either G U = {x U : x n < f(x 1,..., x n )}, or G U = {x U : x n > f(x 1,..., x n )}. In the first case, the n-th component of the outward normal is positive, in the second case, it is negative. yz Exercise Find the flux of the vector field F = xz through the xy following surfaces M: (a) M = {x 2 + y 2 = a 2, < z < h}, the boundary surface of the cylinder, the normal N looks outward with respect to the cylinder. (b) M = {x 2 + y 2 < a 2, z = h}, the top of the same cylinder. The normal N looks in the z-direction. (c) M = {x 2 + y 2 + z 2 = a 2, x, y, z > }. The normal N looks outward with respect to the ball. (d) M = {x + y + z = a, x, y, z > }, the z-component of the normal is positive. The problem we want to look at is as follows: Suppose G R n is a domain with good boundary, F (x) is a C 1 -vector field on G. How to compute the outward flux of F through G? There are two key observations which will allow us to guess the right answer. First, notice that the flux F ( G) is the set-additive functions of G: if G = G 1 G2, G 1 G2 =, then outward flux F ( G) = outward flux F ( G 1 ) + outward flux F ( G 2 ) (the integral over G 1 G2 is cancelled). Of course, this set-additive function is defined on a rather restrictive class of sets G. M 127

128 Now, our intuition 25 suggests us to look at the density of this set-additive function. with re- Definition 14.6 (divergence of the vector field). Density of flux F spect to the cubes is called divergence of the vector field F : 1 div F (x) = lim F, N ds. Q x v n (Q) Q Lemma divf = n i=1 F i x i. Exercise Check that div (hf ) = hdiv F + h, F (h is a scalar function), and div( f) = f ( = n i=1 2 iif, the Laplacian of f). Notations: For x = (x 1,..., x n ) R n, we set x i = (x 1,..., x i 1, x i+1,...x n ) (the i-th coordinate is missing). Proof of the Lemma: This will be a straightforward computation. Fix a = (a 1,... a n ), and consider the cube Q = n i=1 [a i, a i + ɛ]. The boundary Q is the union of 2n faces: E i = {x: x i = a i, x i Q i }, E + i = {x: x i = a i + ɛ, x i Q i }, where Q i = k i [a k, a k + ɛ]. Then and 1 v n (Q) Q F, N E i = F i E i F, N ds = 1 n ɛ n =, F, N E + i = F i E + i [F i ( x i, a i + ɛ) F i ( x i, a i )] d x i i=1 Q i n 1 F i ( x i, a i + ɛ) F i ( x i, a i ) v n 1 ( Q d x i. i ) Q i ɛ i=1 We hope that the RHS converges to n F i i=1 x i (a) as ɛ. To justify the limit transition, we use the fact that F is a C 1 -vector field:, F i ( x i, a i + ɛ) F i ( x i, a i ) ɛ = F i x i ( x i, a i + ɛθ i ) = F i x i (a) + o(1), 25 worked out during the proof of the change of variables theorem 128

129 where o(1) tends to zero uniformly in G when ɛ. Integrating over Q i and taking the sum over i, we get the result. Warning: Our computation works only in the Cartesian coordinates. In the Differential Geometry course, you ll learn how to compute the divergence (and other differential operators, like gradient and Laplacian) in other coordinate systems. Meanwhile, we ll use the divergence only in the Cartesian coordinates. Combining these observations, we expect to get the celebrated result: Theorem 14.9 (Lagrange-Gauss-Ostrogradskii). Suppose G R n is an admissible bounded domain, F is a C 1 -vector field in G continuous in G, N is the outward unit normal on G. Then G F, N ds = outward flux F ( G) = G div F = n G i=1 F i x i We will not reveal the formal definition of the class of admissible domains now, it will be given later, when we come to the proof. Right now, we only mention that this class is sufficiently large for all practical purposes. It contains domains defined as the level sets of C 1 -functions {x: F (x) < c}, where F C 1 (R n, R 1 ), F. More generally, this class contains all domains with C 1 -smooth regular boundary. It also contains domains which are the unions of finitely many cubes. In the plane case, any bounded domain whose boundary is a piece-wise smooth regular curve is admissible. From the divergence theorem we immediately obtain the formula for integration by parts in R n. Corollary Suppose f and g are C 1 -functions on an admissible domain G. Then, for each i, 1 i n, u xi v dx = uvn i ds uv xi dx G G Here, N i is the i-th component of the unit outward normal N to G. Proof: follows at once from the divergence theorem: uv N i ds = (uv) xi ds. Exercise Prove: G G f dx = (The both integrands are vector-fields). 129 G G G fn ds.

130 x Exercise Find the flux of the vector field y through the boundary z surface of the cone {x 2 + y 2 = z 2, < z < h}, the normal looks outside of the cone, i.e., its z-component is negative. Hint: It is simpler, to find the flux through the top {x 2 + y 2 = h 2 } of the cone, and then to use the Divergence Theorem. Exercise Let E = {x 2 /a 2 + y 2 /b 2 + z 2 /c 2 1} be the solid ellipse, p(x, y, z) be the distance from the origin to the tangent plane to E at the point (x, y, z). Compute the integrals E p ds, Answer: 4πabc, 4 3 πabc ( 1 a b c 2 ). E ds p. Hint: one can compute these integrals directly, though the divergence theorem makes the computations much shorter. First, compute p and N: x/a 1 2 p = x2 /a 4 + y 2 /b 4 + z 2 /c, N = p y/b 2. 4 z/c 2 x/a 2 x Then observe that 1 = V, N, V = y/b 2, and p = W, N, W = y. p z/c 2 z Exercise Let F = be the vector field in R 3. Show that for any z admissible domain G the flux of F through G equals the volume of G. Exercise Let G R 3 be an admissible domain. Find the surface integrals N ds, R N ds. G x Here R(x, y, z) = y is the radius-vector. (The both integrand are vectorfunctions!). z Answer: the both integrals vanish. G 13

131 Exercise Let H be a homogeneous polynomial in R 3 of degree k, that is, H(r. ) = r k H(. ). Prove that H ds = 1 H. S k 2 B Hint: use the Euler identity: kh = xh x + yh y + zh z. Exercise Compute the outward flux of the vector field F = (k 1 is an integer) through the unit sphere. Answer: {, k = 2m 14.2 The Gauss Integral Consider the vector field 12π k = 2m 1. k+2 E(x) = 1 x 3 x, x R3 \ {}. This is the potential field: E = U, U(x) = 1 x is the potential of the field E. The field E has zero divergence: div E = 3 i=1 x i ( xi ) = x 3 3 i=1 ( ) 1 x 3 x2 i =. 3 x 5 (Alternatively, one can check that U =, i.e., the function U is harmonic.) What the divergence theorem tells us about the flux of E? Let G be an admissible domain, / G. If / G, then E, N ds =. G Now, suppose that G. Remove from G a small ball B ɛ = { x ɛ}, G ɛ = G \ B ɛ. Then E, N ds =, G ɛ or E, N ds = G E, N ds, S ɛ 131 x k y k z k

132 S ɛ = B ɛ. In our case, E, N = cos < (x, N) x 2 (< (x, N) is the angle between the vectors x and N.) On S ɛ : E, N = 1 x 2 = 1 ɛ 2. Thus and we obtain G S ɛ E, N ds = 1 ɛ 2 Area(S ɛ) = 4π, { cos < (x, N) 4π, G ds(x) = x 2, / G. What actually did we compute together with Gauss? Suppose Σ is a smooth surface in R 3 that does not contain the origin. Then the integral cos < (x, N) ds x 2 represents the solid angle subtended by Σ. Indeed, let Σ π : Σ S 2 132

133 be the radial projection of Σ on the unit sphere S 2. We suppose that π is a one-to-one mapping. Consider the solid body K bounded by Σ, πσ and the conic part. On the conic part of K, the vectors x and N are orthogonal. Thus, the flux of E through the conic part is zero, and by the divergence theorem, the fluxes through Σ and πσ are equal. On πσ, E, N = 1 (see above), hence, the flux of E through πσ equals the area of πσ The Green s formulas and harmonic functions The three celebrated Green s formulas follows at once from the divergence theorem: The 1-st Green formula G u dx = G u n ds Here u n = u, N is the (outward) normal derivative of u. Proof: u = div( u). The 2-nd Green formula: u, v dx = G G u v dx + G u v n ds Proof: u v = u v, N, and div(u v) = u, v + u v. n If u = 1, we get the first formula (for v); if v = u we get G u 2 dx = G u u dx + G u u n ds. The LHS of this formula is called the Dirichlet integral of u. The 3-rd Green formula G (u v v u) dx = is the symmetrized form of the 2-nd one: G ( u v n v u ) ds n Properties of harmonic functions. A C 2 -function u is called harmonic if u =. In the two-dimensional case, harmonic functions are intimately linked with analytic functions. Namely, if u is harmonic in a domain G R 2, 133

134 then its complex gradient u x iu y is the holomorphic function in G, i.e., its real and imaginary parts satisfy the Cauchy-Riemann equations: (u x ) x = ( u y ) y, and (u x ) y = ( u y ) x. Many properties of harmonic function in plane domains follow from those of analytic functions. Now, we use the divergence theorem and Green s formulas to prove several fundamental properties of harmonic functions in R n. Suppose G R n is an admissible domain, u is harmonic in G with the gradient continuous in G. Since = div, the gradient vector field u has zero divergence. Hence (i) The flux of the gradient flow of u across G is zero: u n ds =. G Now, from the second Green s formula, we get (ii) If u = on G, then u is a constant function. n and (iii) If u = on G, then u is the zero function. (iv) Mean-value property If u is harmonic in a ball B = B(x, r) of radius r centered at x, then 1 u(x ) = u ds = 1 u. ω n r n 1 B v n r n B First of all, we assume that x =. This will simplify the notations. We apply the 3-rd Green identity to the functions u and v(x) = 1 1 in x n 2 r n 2 the domain G ɛ = B \ { x ɛ}. Note that the function v is harmonic in G ɛ (check this!), and vanishes on the outer sphere x = r. The apply the Green s identity we need to compute the normal derivative v = v, N on n the boundary spheres x = r and x = ɛ. Since v = n 2 x n x 1. x n, and N = 1 x x 1. x n (with the minus sign on the small inner sphere x = ɛ), we have v = n 2 n x n 1 the small inner sphere x = ɛ). Thus, the Green s formula gives us n 2 r n 1 x =r u ds + n 2 ɛ n 1 x =ɛ u ds + x =ɛ (with the plus sign on ( 1 ɛ n 2 1 r n 2 ) u n ds =. It remains to let ɛ and to check what happens with the second and third surface integrals. Since u is continuous, the second integral converges to (n 2)ω n u() (as above, ω n is the hyper-area of the unit sphere S n 1 R n ). Since u is bounded, the third integral converges to zero. n 134

135 Exercise Fill the details! Thus we get the first mean-value formula. The second formula follows from the first one by spherical integration. In the two-dimensional case n = 2, the same proof works with the function r v(x) = log x x, though you can deduce the same from the mean value property of analytic functions. Exercise Fill the details! (v) Maximum principle Harmonic functions have no local maxima or minima. More precisely, if for some x G, u(x ) = max G u, then u is the constant function. Exercise Give the proof. Hint: the set M = {x G: u(x) = max G u} is closed and open. Hence, it coincides with G. (vi) Liouville Theorem If u is a harmonic function in R n bounded from above (or from below). Then it is a constant function. Hint: Suppose u. Fix any two points x and y. Choose large r and R > x y + r. Then by the mean-value property u(x) = 1 u < 1 v n r n B(x,r) v n r n B(y,R) = u ( R r ) n 1 u = v n R n B(y,R) ( ) n R u(y). r It remains to send r and R in such a way that R/r 1. We get u(x) u(y). By the symmetry, u(x) = u(y). You will learn more about harmonic functions in the course of partial differential equations. Exercise (solution of the Poisson equation). Let u be a C 2 -function that vanishes outside of a compact set in R n, n 3, f = u. Show that f(ξ) dξ u(x) = c n x ξ. n 2 R n Compute c n. Guess how the corresponding formula looks for n = 2. Hint: fix x, and apply the 3-rd Green formula in B(x, R) \ B(x, ɛ), where R is sufficiently large and ɛ is small, to the functions u(ξ) and v(ξ) = x ξ 2 n. Then let ɛ. 135

136 15 Proof of the Divergence Theorem Here, we ll prove the Divergence Theorem. First, we consider the case when the boundary is smooth. Then we prove a more general version which is sufficient for most of the applications. In the course of the proof we ll meet two important constructions: the partition of unity and the cut-off function Smooth boundary Definition Let G R n be a bounded domain. We say that G has a smooth boundary Γ = G, if x Γ ball B x centered at x, and C 1 -function g such that Γ B x = {ξ B x : ξ n = g(ξ 1,..., ξ n 1 )}, G B x = {ξ B x : ξ n < g(ξ 1,..., ξ n 1 )} (after possible re-numeration of the coordinates). Note that at each boundary point x Γ the unit outward normal is well-defined. We fix a covering O of G by the balls B x ; if x Γ, then their radii ρ x are chosen as above, if x G, we always assume that ρ x < dist(x, Γ) Partition of unity Suppose K R n is a compact set with a given covering O by balls: K B x. x K Definition Partition of unity on K subordinated to the covering O is a finite collection of C 1 -functions {ϕ j } such that (i) ϕ j ; (ii) j ϕ j 1 in a neighbourhood of K; (iii) ϕ j vanishes outside of some ball 1 2 B x j. It is not difficult to construct a partition of unity. Take { (ρ 2 x 4 x y 2 ) 2, x y 1 2 ψ x (y) = ρ x, otherwise 136

137 (as above ρ x is the radius of the ball B x ). Then consider the covering K x K 1 4 B x, The centers of the balls from the sub- and choose a finite sub-covering. covering are x 1,..., x M. Set ϕ j (y) def = ψ xj (y) ψ x1 (y) ψ xm (y) (ϕ j (y) = if ψ j vanishes at y). The properties (i) (iii) hold. (Check them!) In the same way, one builds C k -partitions of unity. Exercise Construct a C -partition of unity subordinated to a given covering of the compact set K. Hint: use the function h(t) = { exp( 1/t), t >, t as the building block Integration over Γ Recall, that we have not defined yet the integral over the whole Γ, only over Γ B x. Now, having at hands continuous partitions of unity on Γ, we readily define the integral over Γ: for any f C(Γ), we set Γ f ds def = j Γ fϕ j ds. The integrals on the RHS are defined since on the RHS the integration is local, it is taken only over Γ B xj. Of course, the continuity of f can be replaced by a weaker assumption (after all, we have the Lebesgue criterium of the Riemann-integrability), but we will not pursue this. The definition does not depend on the choice of the partition of unity. Indeed, if {ϕ k } is another partition of unity, then, by additivity of the Riemann integral, fϕ j ds = fϕ j ϕ k ds, Γ k Γ 137

138 so that j fϕ j ds = Γ j,k = k fϕ j ϕ k ds Γ fϕ k Γ j ( ) ϕ j ds = k Γ fϕ k ds The theorem and its proof Now, we are ready to formulate and then to prove the divergence theorem in the smooth case. Theorem Suppose G R n is a bounded domain with a smooth boundary Γ, N is the outward unit normal to Γ, and F is a C 1 -vector field on G. Then F, N ds = divf dx. Γ First, observe that it suffices to prove the result in the special case when the field F is localized ; i.e. given a covering of G by balls G G x G B x, we can always assume that F outside of a ball B x from this covering. Indeed, we construct a C 1 -partition of unity {ϕ j } on G subordinated to this covering. Then we apply the special case to the vector fields ϕ j F, and add the results. We consider separately two cases: x G and x Γ. Assume, first, that x G, then B x G as well (this was our choice of the radius ρ x ). Thus F vanishes on Γ, and we need to check that the integral of the divergence also vanishes. For each i, we denote ŷ i = (y 1,..., y i 1, y i+1,..., y n ), the i-th coordinate is missing. Then G F i y i dy = dŷ i Fi y i dy i, to avoid cumbersome notations, we skip the limits of integration. The inner integral vanishes since F i vanishes on the boundary of B x, thus F i / y i dy =. G 138

139 Hence the divergence integral over G also vanishes. Now, consider the second case x Γ. Then Γ B x = {ξ B x : ξ n = g(ξ 1,..., ξ n 1 )}. We assume that for any x Γ the outward normal N(x) is not parallel to any of the coordinate axes, i.e., g xi, 1 i n 1 (otherwise, we just rotate a bit the coordinate system). Then, for any i, Γ B x can be represented as Then Γ B x = {ξ B x : ξ i = g i ( ξ i )}, 1 i n. F i N i ds = Γ B x ) F i ( ξi, g i ( ξ i ) d ξ i. On the other hand, using Fubini s theorem, we get G B x F i ξ i dξ = d ξ i gi ( ξ i ) F i ξ i dξ i (the lower limit in the inner integral is inessential since ) anyway F i vanishes therein). Thus the inner integral equals F i ( ξi, g i ( ξ i ), and F i F i N i ds = dξ. Γ B x G B x ξ i Done! 15.2 Piece-wise smooth boundary Here, we assume that the boundary Γ = G is decomposed into two parts: a smooth one which is large, and a bad one which is small: Γ = Γ K, where Γ is a finite union (maybe, disconnected) of smooth patches, and at each point x Γ condition 15.1 holds (in particular, the unit outward normal to G is well-defined at these x s), K is a bad compact set such hat (15.5) v n (K +ɛ ) = o(ɛ), as ɛ. Here, K +ɛ is an open ɛ-neighbourhood of K. We shall call such sets (n 1)-negligible compacts 26. Note that a finite union of (n 1)-negligible compacts, is again an (n 1)-negligible compact. 26 Look at Google for the notion Minkowski dimension (or box dimension ). 139

140 Example 15.6 (compact elementary (n 2)-surface is (n 1)-negligible). Suppose Σ = {(y, f 1 (y), f 2 (y)): y Q}, where Q R n 2 is a closed cube. Then Hence, Σ +ɛ {(y, x 1, x 2 ): y Q +ɛ, x i f i (y) < ɛ, i = 1, 2}. v n (Σ +ɛ ) dy Q +ɛ as ɛ. f1 (y)+ɛ f 1 (y) ɛ f2 (y)+ɛ dx 1 dx 2 = 4ɛ 2 v n 2 (Q +ɛ ) = o(ɛ), f 2 (y) ɛ Exercise Show that if the compact K R n is (n 1)-negligible, then for each ɛ > it can be covered by cubes Q j such that l(q j ) is the length-side of Q j. l(q j ) n 1 < ɛ, j Exercise The packing number P (K, ɛ) of a compact set K R n is the maximal cardinality of an ɛ-separated subset in K 27. Show that the compact K is (n 1)-negligible iff P (K, ɛ) = o(ɛ 1 n ) for ɛ. The covering number C(K, ɛ) of K is the minimal cardinality of a set of balls of radius ɛ that cover K (the centers of these balls, generally speaking, do not belong to K). Show that the compact K is (n 1)-negligible iff C(K, ɛ) = o(ɛ 1 n ) for ɛ. Now, we formulate the version of the divergence theorem we mean to prove: Theorem Suppose G R n is a bounded open set, Γ = G can be decomposed into a smooth part Γ and a bad part K which is (n 1)-negligible, and F is a C 1 -vector field on G. Then (15.1) divf dx = F, N ds G Γ 27 The set {x i } is called ɛ-separated if x i x j ɛ for i j. ( ) 14

141 The idea First, suppose that the vector-field vanishes in a neighbourhood of a bad compact set K. Then the proof given above works without any changes and gives us (15.1). Now, we approximate the field F by another one that vanishes in a neighbourhood of K. For this, given ɛ >, we build a C 1 -cut-off function ψ = ψ ɛ such that (i) ψ 1 everywhere; (ii) ψ 1 in the ɛ-neighbourhood of K; (iii) ψ vanishes outside of the 3ɛ-neighbourhood of K; (iv) For ɛ, ψ = o(ɛ), ψ = o(1). R n R n A bit later, we ll build such a cut-off function for any compact (n 1)-negligible set K and any ɛ >. Meanwhile, an instructive exercise, is to consider the case when K is a point in R 1 (then the cut-off with these properties does not exist), and in R 2 (when the cut-off does exist). Having at hands the cut-off function ψ, we easily complete the proof as follows. We apply the divergence theorem to the vector field (1 ψ)f that vanishes near the bad set K and get div [(1 ψ)f ] dx = (1 ψ)f, N ds. G Γ Now, we let ɛ. Since ψ vanishes at that limit, we expect to get (15.1). First, look at the LHS. Since div(1 ψ)f divf = ψ, F ψdivf, we need to estimate two integrals over G. For this, we use that F C 1 (G) and the property (iv) of the cut-off ψ: ψ, F dx max F ψ dx = o(1), G R n G and ψdivf dx max divf ψ dx = o(ɛ). G G R n Now, look at the RHS. Here, the situation is even more simple: we can think that Γ is an elementary patch. Since the F, N is a bounded function on Γ, ψf, N ds Γ F, N ds, Γ 141

142 as ɛ. This step does not need the property (iv) of the cut-off function. (Fill the details!). This proves the theorem modulo the construction of the cut-off function The cut-off function We ll smooth the indicator function 1l K+2ɛ. For this, we fix a C 1 -function χ with the following properties: (a) χ vanishes outside of the ball 1B of radius 1 centered at the origin; 2 2 (b) χ is non-negative, and χ() > ; (c) Clearly such a function exists. χ = 1. Exercise Construct a C -function with these properties. We scale this function: χ ɛ (x) = ɛ n χ(x/ɛ), and finally set (15.12) ψ(x) = 1l K+2ɛ (y)χ ɛ (x y) dy R n (15.13) = 1l K+2ɛ (x y)χ ɛ (y) dy. R n We ll readily checks that ψ satisfies the properties (i)-(iv) stated above. Clearly, ψ is non-negative, and ψ(x) χ ɛ (x y) dy = R n χ(y) dy = ɛ n R n χ(y/ɛ) dy = 1. R n We get (i). If x K +ɛ, then x y K +2ɛ (remember that y ɛb in (15.13)). Hence, for these x s, ψ(x) = χ ɛ (y) dy = 1, R n that is, (ii). If x / K +3ɛ, then for the same reason, x y / K +2ɛ, and the integrand vanishes. Thus ψ(x) = fur such x s. The integral estimates in (iv) are also simple: ψ = R n 1l K+2ɛ R n χ ɛ = o(ɛ), R n 142

143 and ψ R n 1l K+2ɛ R n completing the argument. χ ɛ = o(ɛ) O(1/ɛ) = o(1), R } n {{} = 1 ɛ R n χ 143

144 16 Linear differential forms. Line integrals 16.1 Work (motivation) We start with motivation: suppose F is a force field in R 3, γ : I R 3 piecewise smooth path of motion of particle in the field F, that is γ(t) is a position of the particle at time t, γ(t) is a velocity of the particle at time t, F (γ(t)) the force acting on the particle at time t. We want to compute the amount of work W done by the field F moving the particle along γ. Let us recall how this problem was solved in the high-school physics. First, suppose that the field F is constant. If we move the particle along the segment [P, Q] R 3, then W = F, Q P. If the path is not straight and the field is not constant, we consider a partition Π of the segment I = [a, b]: a = t < t 1 <... < t N = b, and use the additivity of the work: W (γ, F, Π) = N F (γ(t j )), γ(t j ) γ(t j 1 ). j=1 144

145 Rewriting the RHS as N j=1 F (γ(t j )), γ(t j) γ(t j 1 ) t, t = t j t j 1, t we easily recognize the integral sum. Thus, in the limit t, we get W = F (γ(t)), γ(t) dt. I In the coordinates, F = (F x, F y, F z ), γ(t) = (x(t), y(t), z(t)), and ( ) dx W = F x dt + F dy y dt + F dz z dt, dt or symbolically I W = γ F x dx + F y dy + F z dz. The RHS is called the line integral over the curve γ of the linear differential form F x dx + F y dy + F z dz. Note that it does not depend on the choice of parameterization of γ! 16.2 Linear differential forms. Differentials Let f be a C 1 -function in a neighbourhood of x R n. Let us recall that its differential 28 df x is a linear function on the tangent space T x R n. Indeed, if γ is a smooth curve passing through x, γ(t) = x, then f(γ(t)) is a smooth curve passing through f(x), and d dt f(γ(t)) = df x( γ(t)) ( = n i=1 ) f γ i (t). x i Definition The set of linear functionals on the tangent space is called the co-tangent space, and denoted by (T x R n ). Let us fix the orthonormal basis e 1,..., e n in R n, and hence, in all tangent spaces T x R n. It induces the dual orthonormal basis in all co-tangent spaces (T x R n ) : consider the differential dx k of the k-th coordinate function x x k in R n, if ξ T x R n, then dx k (ξ) = n i=1 x k x i ξ i = ξ k. 28 For traditional reason, we say here differential, not derivative, and write df. 145

146 That is, dx 1,..., dx n is the orthonormal basis in (T x R n ) dual to e 1,..., e n. Since n f n f df x (ξ) = ξ i = dx i (ξ), x i x i i=1 we see that the expansion of the differential df x in this bases is i=1 df x = n i=1 f x i dx i. Definition The linear differential form is a mapping x ω x (T x R n ). We usually assume that the linear form ω x is a C 1 -function of x. In the coordinates, ω x (ξ) = ω x ( n i=1 ξ i e i ) = n ω x (e i )ξ i. i=1 Introduce the functions a i (x) def = ω x (e i ) (the coefficients of ω x ); if ω x is a C 1 -form, then the coefficients are C 1 -functions of x (and vice versa!). Thus ω x = n a i (x) dx i. i=1 If ω = df, then we call f the primitive function, and ω the differential. The first natural question: does any differential form has a primitive function? In the one-dimensional situation (n = 1) this is true: ω x (ξ) = a(x) ξ, that is ω x = df, where f = a is a primitive function to a. Consider the two-dimensional case. If ω = df = a(x, y)dx + b(x, y)dy, then a y = 2 f x y = b x. That is, we get a necessary condition for the differential form to have a primitive function: a y = b x We shall see a bit later, that generally speaking, this condition is not a sufficient one, though in the discs and in the whole R 2 it is sufficient. 146

147 Examples Compute the action of the linear forms( ω) 1 = dx 1, ω 2 = ( x 1 dx) 2, ω 3 = dr 2 1 (r 2 = x 2 + y 2 ) in R 2 on the vectors ξ 1 = T 1 R 2, ξ 2 = T 1 (2,2) R 2, ( ) 1 and ξ 3 = T 1 (2,2) R 2. The results are given in the following table Exercise Compute ω x (ξ) if ξ 1 ξ 2 ξ 3 ω ω ω ω = x 2 dx 1 is the differential form in R 3, and ξ = (1, 2, 3) T (3,2,1) R 3. Answer: ω(ξ) = ω = df is the differential form in R n, f = x 1 + 2x nx n, ξ = (+1, 1, +1,..., ( 1) n 1 ) T x R n, x = (1, 1,..., 1). Answer: { m n = 2m, ω(ξ) = m + 1 n = 2m Line integrals U R n domain, ω x differential form on U, γ : I U piece-wise C 1 curve. Definition 16.4 (line integral). ω def = γ I ω γ(t) ( γ(t)) dt. In the chosen coordinates, the integral equals ( ai (γ(t))) dt = ai (x) dx i. I γ 147

148 Examples 1. ω = z dx + x dy + y dz, γ(t) = (cos t, sin t, t), t 2π spiral in R 3, sin t γ(t) = cos t. Then 1 ω γ(t) ( γ(t)) = t ( sin t) + cos t cos t + sin t 1, and ω = γ = 2π 2π ( t sin t + cos 2 t + sin t ) dt t d cos t + 2π 1 + cos 2t 2 dt = 2π + 2π 1 2 = 3π. Exercise Find ω for ω = x dy y dx, and ω = x dy + y dx. The γ curve γ connects the origin O with the point (1, 1): γ is a segment, γ is a part of the parabola {y = x 2 : x 1}, γ is a union of two segments: the vertical one going from (, ) to (, 1), and the horizontal one going from (, 1) to (1, 1). Exercise Compute γ x dx + y dy + z dz x2 + y 2 + z 2 where the path γ starts at the sphere x 2 + y 2 + z 2 = r 2 1 and terminates at the sphere x 2 + y 2 + z 2 = r 2 2. Exercise Find over the parabolic arc z dx + x dy + y dz x = a(1 t 2 ), y = b(1 t 2 ), z = t, starting at (,, 1) and ending at (,, 1). 2. Consider a smooth form on R 2 \ {} ω = y dx + x dy x 2 + y 2, 148

149 and integrate it over the unit circle: γ(t) = (cos t, sin t), t 2π. In this case, ω γ(t) ( γ(t)) = (sin t) ( sin t) + (cos t) (cos t) = 1, so that γ ω = 2π 1 = 2π. Let s have a closer look at this example which is nothing but a twodimensional version of the Gauss integral (why?). Consider the polar angle function θ(x, y) = arctan y x. If (x, y), then θ x = y 2 /x 2 ( y x 2 ) = y x 2 + y 2, θ y = y 2 /x 2 ( ) 1 = x x x 2 + y 2, hence dθ = ω. Clearly, the function θ cannot be defined continuously in the whole R 2 \ {}. In this example, ω = a dx + b dy, and the necessary condition a y = b x for the form to have a primitive function holds. On the other hand, ω does not have a primitive in R 2 \ {} by the following version of the Newton-Leibnitz formula: b d df = (f γ) (t) dt = f(γ(b)) f(γ(a)). dt γ a In particular, if the curve γ is closed (i.e. γ(a) = γ(b)), then df =. γ In our case, we know that the integral of ω over the circle does not vanish! Properties of line integrals The definition does not depend on the choice of parameterization of the curve γ. Indeed, suppose γ : I R n, µ = γ ϕ another parameterization (ϕ: J I C 1 -smooth, orientation preserving bijection). Then, after the change of 149

150 variables t = ϕ(s), we get ω = ω γ(t) ( γ(t)) dt = ω µ(s) ( γ(ϕ(s))) ϕ(s) ds γ I J = ω µ(s) ( γ(ϕ(s)) ϕ(s)) ds = ω µ(s) ( µ(s)) ds. J If γ the curve traversed in the opposite direction, then ω = ω. (Prove this!) γ Suppose the starting point of the curve γ 2 coincides with the terminating point of the curve γ 1. Denote the composite curve by γ 1 + γ 2. Then ω = γ 1 +γ 2 ω + γ 1 ω. γ 2 Estimate of the line integral Exercise Suppose γ : [, 1] R n is a piece-wise C 1 -path, ω is a continuous linear form on U γ[, 1]. Then ω sup ω x L(γ). x γ[,1] Polygonal approximation of line integrals. γ Exercise Suppose γ : [, 1] R n is a piece-wise C 1 -path, ω is a continuous linear form on U γ[, 1]. Then, for any ɛ >, there exists a polygonal line ν : [, 1] U, ν() = γ(), ν(1) = γ(1), such that max ν(t) γ(t) < ɛ, t [,1] γ J and ω γ ν ω < ɛ. 15

151 Claim Suppose ω is a differential form in U R n with continuous coefficients. Then TFAE: (i) for any closed curve γ, γ ω = ; (ii) if the curves γ and µ have the same beginning and the same end, then γ ω = µ ω; (iii) there exists a C 1 -function f C 1 (U) such that df = ω. We shall prove only (ii) (iii), the rest is obvious. Fix p U, and set f(x) = ω, where the path γ starts at p and terminates at x. The function f is welldefined by (ii). Then γ f(x 1 + ɛ, x 2,..., x n ) f(x 1, x 2,..., x n ɛ = 1 ɛ x1 +ɛ x 1 a 1 (t, x 2,..., x n ) dt (ω = a i dx i ). Thus, x1 f = a 1, and similarly, for all i, xi f = a i. Clearly, f C 1 (U). The next statement is deeper than the previous ones: Poincaré Lemma If U R n is a ball (or the whole R n ), then condition a i x k = a k x i, 1 i, k n ( ) is equivalent to any of conditions (i)-(iii) from Claim The forms satisfying condition ( ) are called closed. That is, in balls and in the whole R n any closed form is a differential of a function 29. We will prove later, that the same result holds in arbitrary simply connected domains in R n. Proof: it is not difficult to guess the primitive function. WLOG, suppose that ω = a i dx i is an exact form in the unit ball B R n. Set g(x) def = 1 n x i a i (tx) dt = i=1 n 1 x i a i (tx) dt. i=1 29 such forms are called exact ones 151

152 Then g x k = ( ) = = = a k (tx) dt + a k (tx) dt + a k (tx) dt + n 1 a i x i t dt x k i=1 n 1 x i i=1 1 a k (tx) dt + ta k (tx) t a k x i dt t d dt a k(tx) dt t=1 t= 1 a k (tx) dt = a k (x). If you start to feel that you ve learnt something very similar in the Complex Analysis course, then you are right. If f = u + iv is a complex-valued function, then f dz = u dx v dy + i(v dx + u dy). Exercise Deduce form the Poincaré lemma the Cauchy theorem: if f = u + iv is analytic function in a disc D (that is, f is a C 1 -function, and its real and imaginary parts u and v satisfy the Cauchy-Riemann equations u x = v y, u y = v x ), then for any closed contour γ D, f(z) dz =. γ Hint: find out when the complex-valued differential form f dz is closed. Given a closed form in a disc (or in R n ) it is easy to find its differential by integration: Example Let ω = y dx + x dy + 4dz. Conditions ( ) are satisfied. We are looking for the primitive function f. f x = y f(x, y, z) = xy + f 1(y, z), f y = x x + f 1 y = x f 1 = f 1 (z), f z = 4 f 1(z) = 4z + Const. Thus the primitive function is f(x, y, z) = xy + 4z + Const. 152

153 Exercise Check which of the following linear differential forms has a primitive function. If the primitive exists, find it. (4x 3 y 3 2y 2 ) dx + (3x 4 y 2 2xy) dy, ((x + y + 1)e x e y ) dx + (e x (x + y + 1)e y ) dy Vector fields and differential forms There is a simple duality between the differential forms and the vector fields: if F is a vector field in U R n, then the work form ω F is defined as ω F (ξ) = F (x), ξ, ξ T x R n. Having the form ω, the same formula we recovers the field F. The integral of ω F over the curve γ U gives us expression for the work done by the field F for transportation of a particle along γ: ω F = F (γ(t)), γ(t) dt. γ I The vector field F is called potential field or gradient field if there exists a function U (called potential) such that F = U. Equivalently, the work form ω F = du. For example, the Coulomb and gravitational fields are potential ones. The field F is called conservative if the work done by F in moving a particle along any loop γ U is zero. Equivalently, the work done by F in moving a particle from the point x to the point y depends only on x and y, and does not depend on the path. In virtue of Claim 16.1, the notions of potential and conservative fields coincide. e x cos y + yz Exercise Check that the vector field in R n F = xz e x sin y is xy + z conservative, and find its potential. Exercise The vector field F is called central-symmetric (with respect to the origin), if the size of the field is a radial function: F (x) = f( x ), and the direction of the field coincides with the one of the point-vector, i.e. F (x) = f( x ) x x. Prove that any central-symmetric field in R n is a potential one. Find the radial potential U(r), r = x, in terms of f. 153

154 16.5 The arc-length form ds Now, we have two notions of integrals over the curves: we learnt non-oriented integrals of functions (as a special case of surface integrals) and oriented integrals of differential forms. How these two notions are related to each other? Suppose γ : I R n smooth path, T (x) = γ(t) unit tangent vector to γ at x = γ(t) (Check that it does not γ(t) depend on the choice of the parameterization of γ!) Definition (arc-length differential form). ds x (ξ) = T (x), ξ, ξ T x R n. Warning: in spite of (traditional) notation ds, this is not a differential! If ρ is a continuous function on the curve γ, then we can define a new form ρ ds which has a density ρ with respect to the form ds; i.e. Then oriented γ (ρ ds)(ξ) = ρ(x) ds x (ξ), x = γ(t). ρ ds = = = that is our definitions coincide! I I I ρ(γ(t)) (ds) γ(t) ( γ(t)) dt ρ(γ(t)) T (γ(t)), γ(t) dt ρ(γ(t)) γ(t) dt = non oriented γ ρ ds, Question The definition of the integral of a function clearly does not depend on the orientation of the curve, the definition of the integral of a form does depend on the orientation. How this could happen? To finish this discussion, observe that any line integral can be rewritten as an arc-length integral : given a form ω x (T x R n ), take the vector field F (x) T x R n such that ω = ω F. Then ω = F (γ(t)), γ(t) dt γ I = F (γ(t)), T (γ(t)) γ(t) dt I = F, T ds, γ 154

155 that is ω F = F, T ds. γ γ 155

156 17 Green s theorem Theorem Suppose G R 2 is a domain whose boundary Γ = G consists of finitely many piece-wise C 1 -curves, and is positively oriented 3. Suppose P dx + Qdy is a C 1 -differential form on G. Then ( Q P dx + Q dy = x P ) dxdy. y Γ G As in the Divergence Theorem, the integral on the left-hand side is an additive set-function. It s easy to compute its density : Exercise Let S be a square centered at the point (ξ, η), and Γ be its boundary with the positive orientation. Check that 1 lim P dx + Q dy = Q P (ξ, η) (ξ, η). S (ξ,η) Area(S) x y Γ Then using the properties of the line integrals we know already, it is not difficult to complete the proof Green s Theorem, approximating from inside the domain G by the connected union of the squares. However, we shall not do this. We show that Green s Theorem is a simply corollary to the Divergence Theorem. To this end, consider the unit normal field to a two-dimensional curve γ : I R 2 : N(γ(t)) = 1 γ ( γ2 γ 1 (This definition does not depend on parameterization of γ. Check!) With this definition, the normal N lies to the right to the tangent T. Thus if Γ = γ(i) is the oriented boundary of G, then N is the outer normal to Γ. ( ) F1 Now, let F = be a smooth vector field on G. Consider the linear form F 2 ω = F 2 dx + F 1 dy (Warning: this is not the work form we defined earlier!) Then ω = [ F 2 (γ(t)) γ 1 (t) + F 1 (γ(t)) γ 2 (t)] dt γ I = F (γ(t)), N(γ(t)) γ(t) dt = F, N ds. I 3 This means that if one traverses the boundary in the positive direction, then his/her left foot is within the domain ( The Law of the Left Foot ). 156 ) γ

157 We can go in the( opposite ) direction: if ω = P dx + Q dy is a differential form Q in R 2, then F = is the corresponding vector field. Thus P P dx + Q dy = F, N ds Γ Γ ( Gauss Q = div(f ) dxdy = x P ) dxdy. y G We see that Green s Theorem is equivalent to the two-dimensional version of the Divergence Theorem 31. Exercise C is the unit circle with positive orientation. Find e x2 y 2 (cos 2xy dx + sin 2xy dy) C Exercise Γ is the boundary of the square [, π] [, π] with natural orientation. Find ( cos x cos y + 3 x2) dx + ( sin x sin y + (y 4 + 1) 1/4) dy. Γ 31 In fact, its proof for domains with piece-wise smooth boundaries is essentially simpler than the proof of the Divergence Theorem we gave. 157 G

Analysis-3 lecture schemes

Analysis-3 lecture schemes (with Homeworks) 1 Csörgő István November, 2015 1 A jegyzet az ELTE Informatikai Kar 2015. évi Jegyzetpályázatának támogatásával készült Contents 1. Lesson 1 4 1.1. The Space