EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 While these notes are under construction, I expect there will be many typos. The main reference for this is volume 1 of Hörmander, The analysis of liner partial differential equations. I have picked a few of the most useful and concrete highlights. The references are based on the 1989 hardcover second edition. 1. Generalities (from Ch. 2 and 3) Definition 1.1. Let U be an open set in R n. A distribution u D (U) is a linear function u : C0 (U) C. One can write u(φ) =< u, φ > and think of this, informally, as u(φ) = uφ. It is required that u is continuous in the following sense: For every K U compact there exist C, k such that u(φ) C sup α φ (1) x α k for every φ C 0 (U) supported in K. If one k works for all K, u is of finite order. The smallest such k is the order of u. We will need an equivalent formulation of the continuity condition. Definition 1.2. Let φ j, φ C 0 (U). The sequence φ j φ in C 0 (U) if there exists a compact subset of U which contains the support of all φ j, φ and for every fixed α, sup x α (φ j (x) φ(x)) 0 as j. Theorem 1.3. A linear function u : C 0 (U) C is a distribution if and only if u(φ j ) u(φ) for every φ j φ in C 0 (U). Proof. To show that if u is a distribution, then u(φ j ) u(φ) for every φ j φ in C0 (U) is clear from the definition. The other half is an easy exercise in negations. Examples: (1) If ũ is a locally integrable function, u(φ) := ũφ. This identifies the function ũ with a distribution u. (2) Dirac delta function. δ a (φ) = φ(a) 1
2 EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 (3) Weak derivatives: If u is a smooth function, and φ C0 is a test function, α = (α 1,, α n ) is a multi-index and α u = α 1 x 1 x αn n u, then < α u, φ >:= α uφ = ( 1) α u α φ = ( 1) α < u, α φ >. (integration by parts). This motivates the definition of α u for any distribution u: < α u, φ >= ( 1) α < u, α φ >. (4) It takes some work (thm. 4.4.7 in Hörmander) and we will not prove this, but the above essentially accounts for all possible distributions: If u D (U) then there exists a locally finite family of continuous functions f α (each compact subset of U intersects only finitely many of the supports of the f α s) such that u = α α f α Definition 1.4. A sequence of distributions u i converges to u in D (U) (or in the sense of distribution theory) if u i (φ) u(φ) for every φ C 0 (U) Also, if u i D (U) and for each fixed φ C 0 (U) the limit u i (φ) exists and is denoted u(φ), then u is automatically a distribution. See Theorem 2.1.8. We will not prove this. Definition 1.5. Let u D(U) and f C (U). Then the distributions and fu are defined by ( ) u u x k x k ) ( (φ) = u x k (fu) (φ) = u(fφ) Unlike classical convergence, if u i u in D (U), then α u i α u in D (U) is trivial. Example 1: Let H be the Heavyside function. Then H = δ 0. The following will be worked out in class: If E is the fundamental solution of the Laplace operator, E in the sense of distributions agrees with the locally integrable function E defined for x 0, but E in the sense of distributions does not agree with the locally integrable function E = 0 defined for x 0. In fact E = δ 0. Definition 1.6. A distribution u is defined to be 0 in an open set V U if u(φ) = 0 for every φ C 0 (V ). The union of all such subsets V is the largest open set where u is 0, and the complement of that is defined to be the support of u.
EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 3 Thus the support of a distribution u D(U) is always (relatively) closed in U. If the support of u is compact, u is called compactly supported. The set of compactly supported distributions in U is denoted by E (U) Recall the support of a function φ is the closure of the set {φ(x) 0}. If u R n and φ C0 (R n ), and the support of φ and u are disjoint, then u(φ) = 0. However, if φ is zero on the support of u, it does not follow that u(φ) = 0. Example: δ (x). If u E (U), u(φ) is well defined for φ C : Let K be the support of u, K V U with V open. There exists a smooth cut-off function ζ C0 (U), and ζ = 1 in V. Then u(ζφ) is well-defined, and is independent of the choice of ζ. Define u(φ) = u(ζφ) for ζ as above. Definition 1.7. A distribution u is defined to be smooth in an open set V U if there exists ũ C (V ) such that u(φ) = ũ(x)φ(x)dx for all φ C 0 (V ) The union of all such subsets V is the largest open set where u is smooth, and the complement of that is defined to be the singular support of u. The major goal of the next section is to prove 1) If f E (R n ), then there exists u D (R n ) such that u = f. 2) If u, f are as above, and f C (V ) for some open set V, then u C (V ). Both of these goals follow from the properties of the convolution of a distribution with a compactly supported distribution. Part 1 follows by writing u = E f, u = ( E) f = δ f = f, but we have to assign rigorous meaning to this. Part 2 follows from the fact that the fundamental solution E is C away from 0. The exact same results hold for 2 but not. t t 2 2. Convolutions (Chapter 4 in Hörmander s book) Definition 2.1. If u D (R n ) and φ C 0 (R n ), u φ(x) = u(φ(x )) (where stands for y, and u acts in the y variable) Check u φ C, α (u φ)(x) = ( α u) φ = u ( α φ)(x): We have φ(x y + ɛe i ) φ(x y) = x i φ(x y) + R(x y, ɛ)
4 EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 where 1 d 2 R(x y, ɛ) = 0 dt (φ(x y + tɛe i)) (1 t)dt 2 1 ( ) = ɛ 2 2 φ (x y + tɛe i )(1 t)dt 0 x 2 i Fix x. R(x y, ɛ) is in C 0, and sup y α y R(x y, ɛ) C α ɛ 2. Using the continuity condition (1) we see and R(x, ɛ) lim ɛ 0 ɛ u (φ(x + ɛe i )) u (φ(x )) lim ɛ 0 ɛ = 0 = u( x i φ(x )) = u x i (φ(x ) Check support(u φ) support u + support φ: Fix x. If φ(x ) is supported in the complement of support u, then u(φ(x ) = 0 by the definition of support u. If u(φ(x ) 0, then y support u and y support φ(x ). Thus y = lim y i with φ(x y i ) 0, and x = lim (x y i + y) support φ + support u = support φ + support u because support u is compact. We also have Theorem 2.2. Let u D (R n ), and φ, ψ C 0 (R n ). Then (u φ) ψ = u (φ ψ). Proof. Before starting the proof, review Definition (1.2). u φ C. Fix x. (u φ) ψ(x) = (u φ)(x y)ψ(y)dy = lim (u φ)(x kh)ψ(kh)h n h 0+ k Z ( n ) = lim u φ(x kh )ψ(kh)h n h 0+ k Z ( n ) = u φ(x z )ψ(z)dz In the last line, we used the (obvious) fact that, for x fixed, φ(x kh y)ψ(kh)h n φ(x z y)ψ(z)dz k Z n
EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 5 uniformly in y, and the same is true for after differentiating with respect to y an arbitrary number of times. Also, both LHS and RHS are supported in a fixed compact set. In other words, LHS RHS in C0. This implies the important theorem on approximating distributions by C functions. Theorem 2.3. Let u D (R n ), and let η ɛ be the standard mollifier. Then u η ɛ C (R n ) and u η ɛ u in the sense of distribution theory (as ɛ 0). Proof. We have to check (u η ɛ )(φ) u(φ) for every φ C 0 (R n ). The proof is based on the observation that u(φ) = u φ (0) where φ (x) = φ( x). So it suffices to show (u η ɛ ) φ(0) u φ(0). But since η ɛ φ φ in C 0. (u η ɛ ) φ(0) = u (η ɛ φ)(0) u φ(0) Now we define the convolution of two distribution u 1, u 2, one of which is compactly supported. This is defined so that the formula (u 1 u 2 ) φ = u 1 (u 2 φ) holds for all φ C 0 (R n ). For simplicity, let s assume u 2 is compactly supported. Instead of defining (u 1 u 2 )(φ) it suffices to define (u 1 u 2 ) φ(0). This is done in the obvious way: (u 1 u 2 ) φ(0) = u 1 (u 2 φ)(0) We have to check that u 1 u 2 satisfies the continuity condition. Let φ j 0 in C 0 (R n ) (see Definition (1.2)). Then so does u 2 φ j, and u 1 (u 2 φ j )(0) 0. Also, it τ h denotes a translation, (τ h φ)(x) = φ(x+h), then τ h (u φ) = u (τ h φ) and (u 1 u 2 ) φ(h) = τ h ((u 1 u 2 ) φ) (0) = ((u 1 u 2 ) τ h φ) (0) Proposition 2.4. Let u 1, u 2 supported. Then = u 1 (u 2 τ h φ)(0) = u 1 (τ h (u 2 φ)) (0) = u 1 (u 2 φ) (h) D (R n ), one of which is compactly support (u 1 u 2 ) support u 1 + support u 2
6 EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 Proof. Let η ɛ be a standard mollifier supported in a ball or radius ɛ. It suffices to show support (u 1 u 2 ) support u 1 + support u 2 + support η ɛ0 for all ɛ 0 > 0. We do know support (u 1 u 2 η ɛ ) support u 1 + support u 2 + support η ɛ support u 1 + support u 2 + support η ɛ0 for all 0 < ɛ < ɛ 0. Also remark that if A is closed and u is a distribution such that support u η ɛ A for all ɛ 0 > ɛ > 0, then support u A. This amounts to showing that if u η ɛ = 0 in A c, then u = 0 in A c, which follows from u η ɛ u in the sense of distributions. Theorem 2.5. Let u 1, u 2, u 3 distributions in R n, two of which are compactly supported. Then (u 1 u 2 ) u 3 = u 1 (u 2 u 3 ) Proof. The proof follows by noticing it suffices to check ((u 1 u 2 ) u 3 ) φ = (u 1 (u 2 u 3 )) φ for every φ C0 (R n ) which follows easily from the defining property of Theorem (2.2). Theorem 2.6. Let u 1, u 2 D (R n ), one of which is compactly supported. Then u 1 u 2 = u 2 u 1 Proof. The strategy is to show that (u 1 u 2 ) (φ ψ) = (u 2 u 1 ) (φ ψ) for all test functions φ, ψ. This is done using the associativity property Theorem (2.2) together with the fact that convolutions of functions is commutative. We will not prove this Theorem 2.7. Let u 1, u 2 D (R n ), one of which is compactly supported. Then α (u 1 u 2 ) = ( α u 1 ) u 2 = u 1 α u 2 (2) Proof. We already know α (u φ) = ( α u) φ = u ( α φ), so the theorem is proved by convolving (2) with φ. Theorem 2.8. Let u 1, u 2 D (R n ), one of which is compactly supported. Then sing support (u 1 u 2 ) sing support u 1 + sing support u 2 Proof. The proof is based on the fact that if one of u 1, u 2 is smooth, so is u 1 u 2. Let χ 1, χ 2 be supported in small neighborhoods of sing support u 1, sing support u 2, so that (1 χ 1 )u 1 and (1 χ 2 )u 2 are smooth. Then sing support (u 1 u 2 ) sing support (χ 1 u 1 ) (χ 2 u 2 ) support χ 1 u 1 + support χ 2 u 2
EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 7 Now we come back to PDEs. Let P (D) be a constant coefficient differential operator. A distribution E D (R n ) is called a fundamental solution if P (D)E = δ. We already know formulas for (the) fundamental solution of the Laplace and heat operators. We will write down later several fundamental solutions of the wave operator. Theorem 2.9. If sing support (E) = {0}, U is open and u D (U) is such that P (D)u C (U), then u C (U) Proof. Let V U an arbitrary open subset. It suffices to show u C (V ). Let ζ C 0 (U), ζ = 1 on V. Then P (D)(ζu) = P (D)u in V, and in particular is C there. Finally, and therefore ζu = ζu δ = ζu P (D)(E) = (P (D)(ζu)) E sing support (ζu) sing support (P (D)(ζu)) + {0} = sing support (P (D)(ζu)) But we know that sing support (P (D)(ζu)) is disjoint from V, so sing support (ζu) is also disjoint from V, in other words ζu, which equals u in V, is smooth there. 3. The Fourier transform Definition 3.1. The space of Schwartz functions S is defined by the requirement that all semi-norms sup x α β f x be finite. Convergence in this space means for all α, β. sup x α β (f n f) 0 x The Fourier transform F(f) = ˆf is defined by ˆf(ξ) = e ix ξ f(x)dx R n The following are elementary properties which will be checked in class:
8 EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 Lemma 3.2. Let f S, denote f λ (x) = f(λx) (λ > 0), τ y f(x) = f(x + y) (y R n ) and D j = 1 i x j. Then ˆf S and f ˆf is continuous in the topology of S. Also, ˆf λ (ξ) = 1 λ n ˆf( ξ λ ) F(τ y f)(ξ) = e iy ξ ˆf(ξ) F(D j f)(ξ) = ξ j ˆf(ξ) F(x j f)(ξ) = D j ˆf(ξ) ( e x 2 2 F fĥ = ) (ξ) = (2π) n/2 e ξ 2 2 ˆfg F (f g) = ˆfĝ for all ˆf, ĥ S These easily imply the inversion formula and Plancherel formulas, which will be proved in class. Theorem 3.3. Let f S. Then f(x) = 1 e ix ξ ˆf(ξ)dξ (2π) n R n Also, f(x)g(x)dx = 1 ˆf(ξ)ĝ(ξ)dξ R (2π) n n Definition 3.4. The space of continuous linear functionals u : S C is the space of tempered distributions S. u S if and only if there exists N and C such that < u, φ > C sup x α β (f) x α, β N for all φ S. If u S, then û S is defined by for all φ S. R n < û, φ >=< u, ˆφ > Example: The constant function 1 S and ˆ1 = (2π) n δ.
EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 9 4. Integrating functions on a k-dimensional hypersurface in R n This section uses geometric notation: coodinates are written x i. Let S be a compact C 1 k-dimensional hypersurface in R n. We will integrate f : S R, continuous. Each point in S has a neighborhood (a ball B ) such that S B can be parametrized: There exists P : C S B R n where C is open in R k and P is one-to-one and onto. We assume P is C 1 and the n vectors P i are linearly independent. This insures S is a C 1 hypersurface. S can be covered by finitely many such balls B ri (x i ). Before anything else, we break up f as a finite sum of continuous functions f i, each supported in one such B. For convenience and without loss of generality, we assume f has been extended as a continuous function to R n, and is supported in the union of the B ri (x i ). Theorem 4.1. Let f : R n R be a C l function supported in a finite union of k balls k i=1b ri (x i ). Then there exist C l functions f i supported in B ri (x i ) such that f = f i. Proof. First we argue that there are s i < r i such that the support of f is covered by B si (x i ). Indeed, the infinite union si <r i B si (x i ) cover the compact support of f, so finitely many also do. Let χ i be C l functions supported in B ri (x i ), χ i = 1 on B si (x i ). (It is easy to construct such functions). Write so 0 = f(x)(1 χ 1 (x))(1 χ 2 (x)) (1 χ k (x)) f(x) = f(x)χ 1 (x) + f(x)(1 χ 1 (x))χ 2 (x) + f(x)(1 χ 1 (x))(1 χ 2 (x))χ 3 (x) + f(x)(1 χ 1 (x))(1 χ 2 (x)) χ k (x) =: f 1 (x) + f 2 (x) + f k (x) We proceed to integrate one of the f i s, written as f from now on. The Euclidean metric in R n induces a Riemannian metric on S. With respect to there coordinates, g ij (x) is defined as the Euclidean inner
10 EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 product of the push-forward of the usual basis vectors in R k : In matrix notation, g ij (x) = P (x) x i P (x) x j (g ij (x)) = (DP (x)) T DP (x) The standard Riemannian geometry definition of fd V ol is S fd V ol = f (P (x)) det(g ij (x)) dx 1 dx k S C An explicit calculation shows this is independent of the parametrization. It agrees with familiar formulas in low dimensions: Example 1: k = 1 (curves in R n ). Here P : (a, b) R n, P (x) 0 for all x, and g 11 (x) = P (x) 2, so b fds = f(p (x) F (x) dx S Example 2: k = 2, n = 3 (surfaces in R 3 ). Here DP (x) has two columns, P and P. x 1 x 2 Exercise (using a suitable rotation) det(g ij ) = P x P 1 x 2 so this is consistent with the Math 241 formula fds = f(p (x)) P x P 1 x 2 dx1 dx 2 S C a Example 3: k = n. Here (g ij ) = (DP (x)) T DP (x), det(g ij ) = det(dp ) 2 so we get the change-of-variables formula fdx = f(p (x)) det DP (x) dx S C Example 4 (what we need to prove the divergence theorem) Let k = n 1 and S given (locally) as the graph of a function r: 2 Then P (x 1, x n 1 ) = (x 1, x n 1, r(x 1, x n 1 ))
EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 11 (g ij (x)) = 1 0 x 1 0 1 x n 1 = I (n 1) (n 1) + x 1 x 2 x n 1 ( x 1 1 0 0 0 1 0 0 0 1 x 1 x 2 x 2 x n 1 ) x n 1 Exercise: det(g ij ) = 1 + r 2. So fds = f(x 1,, r(x 1, x n 1 )) 1 + r 2 dx 1 dx n 1 S C 5. The gradient of a characteristic function This is background material (formula 3.1.5 in Hörmander s book). Theorem 5.1. Let U be an open set with C 1 boundary. Then χ U = νds where ds is surface measure on U and ν is the outward pointing unit normal. Proof. Let h : R R be a smoothed out Heaviside function: h(x) = 0 if x 0, h(x) = 1 if x 1 and smooth in-between. Using a partition of unity, it suffices the prove the theorem for test functions φ supported in a small neighborhood of x 0 U, where U agrees with x n > r(x 1, x n 1 ). Then
12 EXPOSITORY NOTES ON DISTRIBUTION THEORY, FALL 2018 < χ U, φ >= < χ U, φ > = lim h( x n r(x 1, x n 1 ) ) φ(x 1,, x n )dx 1 dx n ɛ 0 ɛ ( = lim h( x ) n r(x 1, x n 1 ) ) φ(x 1,, x n ) ɛ 0 ɛ 1 = lim ɛ 0 R ɛ h ( x n r(x 1, x n 1 ) ) ( r(x 1,, x n 1 ), 1)φ(x)dx ɛ n ( 1 = ( r(x 1,, x n 1 ), 1) lim R n 1 ɛ 0 R ɛ h ( x ) n r(x 1, x n 1 ) )φ(x)dx n dx 1 dx n 1 ɛ = φ(x 1, x n 1, r(x 1, r x 1 ))( r(x 1,, x n 1 ), 1)dx 1 dx n 1 R n 1 = φνds U (by the Calculus formulas for ν and ds). We used the fact that 1 ɛ h ( x ɛ ) is an approximation to the identity.