Advanced Real Analysis


Edited and typeset by David McCormick
Based upon lectures by José Luis Rodrigo
University of Warwick, Autumn

Preface

These notes are primarily based on the lectures for the course MA4J0 Advanced Real Analysis, given in the autumn by Dr José Luis Rodrigo at the University of Warwick. I have embellished the material as lectured to include slightly more detail and exposition, and in one or two places I have reordered things a little to make the text flow better. In addition, I have included the material in José's handwritten notes which was not lectured. These sections are marked with a star both in the title and on the contents page, and are not examinable (at least in the year of these lectures). In particular, section 2.8 on the Lebesgue differentiation theorem, section 3.5 on the Fourier transform of tempered distributions, section 3.6 on Sobolev spaces, part of section 3.7 on fundamental solutions, and all of section 4 on the Hilbert transform were not lectured, and are not examinable.

I would very much appreciate being told of any errors or oddities in these notes, the responsibility for which is mine alone and no reflection on José's excellent lectures. Any corrections may be sent by email to d.s.mccormick@warwick.ac.uk; please be sure to include the version number and date given below.

David McCormick, University of Warwick, Coventry
Version of January 7

Edition History
Version of January 7: Initial release for proofreading.


Contents

Preface
Introduction
1 Fourier Series
  1.1 Basic Definitions and the Dirichlet Kernel
  1.2 Convergence and Divergence
  1.3 Good Kernels and PDEs
  1.4 Cesàro Summation and the Fejér Kernel
  1.5 Abel Summation and the Poisson Kernel
2 Fourier Transform
  2.1 Definition and Basic Properties
  2.2 Schwartz Space and the Fourier Transform
  2.3 Extending the Fourier Transform to $L^p(\mathbb{R}^n)$
  2.4 Kernels and PDEs
  2.5 Approximations to the Identity
  2.6 Weak $L^p$ Spaces
  2.7 Maximal Functions and Almost Everywhere Convergence
  2.8 The Lebesgue Differentiation Theorem*
3 Distribution Theory
  3.1 Weak Derivatives
  3.2 Distributions: Basic Definitions
  3.3 Distributional Derivatives and Products
  3.4 Distributions of Compact Support, Tensor Products and Convolutions
  3.5 Fourier Transform of Tempered Distributions*
  3.6 Sobolev Spaces*
  3.7 Fundamental Solutions
4 Hilbert Transform*
References


Introduction

Just over two hundred years ago, Joseph Fourier revolutionised the world of mathematics by writing down a solution to the heat equation by means of decomposing a function into a sum of sines and cosines. The need to understand these so-called Fourier series gave birth to analysis as we know it today: what's amazing is that the process of understanding Fourier series goes on, and Fourier analysis is still a fruitful area of research.

In this course we aim to give an introduction to the classical theory of Fourier analysis. There are four chapters, which cover Fourier series, the Fourier transform, distribution theory, and the Hilbert transform respectively. (Note that the starred sections are not examinable in the year of these lectures.) Some of the principal questions which serve as motivation for the study of Fourier analysis are as follows:

Fourier Series. Let $f : [0, 2\pi] \to \mathbb{R}$ be a $2\pi$-periodic function, and let its $n$-th Fourier coefficient be given by
\[
\hat f(n) := \frac{1}{2\pi} \int_0^{2\pi} f(y) e^{-iny} \, dy.
\]
Can you recover $f$ as $\sum_{n=-N}^{N} \hat f(n) e^{inx}$? If so, how does it converge, and under what conditions?

Fourier Transform. Let $f : \mathbb{R} \to \mathbb{R}$ be any function, and define its Fourier transform $\hat f : \mathbb{R} \to \mathbb{C}$ by
\[
\hat f(y) := \frac{1}{2\pi} \int_{-\infty}^{+\infty} f(x) e^{-iyx} \, dx.
\]
How, if possible, can we rebuild $f$ from $\hat f$? Does
\[
T_N f(x) := \int_{-\infty}^{+\infty} \hat f(s) \chi_{[-N,N]}(s) e^{ixs} \, ds
\]
converge to $f(x)$? If so, how does it converge, and under what conditions? Whether $T_N f$ converges or not depends on the dimension; C. Fefferman won a Fields medal for that discovery.

Distributions. In order to study Fourier series and Fourier transforms in full generality, we will need tools from the theory of distributions. For example, if we try to take the Fourier transform of $f(x) = e^{ix}$, we get that
\[
\hat f(y) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} e^{ix} e^{-iyx} \, dx,
\]
so that $\hat f(y) = 0$ for $y \neq 1$, and $\hat f(1) = +\infty$, but in such a way that $\int \hat f(y) \, dy = 1$. Such an $\hat f$ isn't really a function: we thus need to generalise the notion of function to a distribution.

Hilbert Transform. The central ideas of this course are all linked by the Hilbert transform:
\[
Hf(x) = \mathrm{p.v.} \int_{-\infty}^{+\infty} \frac{f(y)}{x - y} \, dy,
\]

which uses techniques from complex analysis, Fourier series, singular integrals, and PDEs (the solution of the Laplacian). Let us now show heuristically (!) how $T_N f$ is related to the Hilbert transform $Hf$ introduced above:
\[
\begin{aligned}
T_N f(x) &= \int_{-\infty}^{+\infty} \hat f(s) \chi_{[-N,N]}(s) e^{ixs} \, ds \\
&= \int_{-\infty}^{+\infty} \left( \frac{1}{2\pi} \int_{-\infty}^{+\infty} f(y) e^{-iys} \, dy \right) \chi_{[-N,N]}(s) e^{ixs} \, ds \\
&= \frac{1}{2\pi} \int_{-N}^{N} \left( \int_{-\infty}^{+\infty} f(y) e^{-iys} \, dy \right) e^{ixs} \, ds \\
&= \frac{1}{2\pi} \int_{-\infty}^{+\infty} f(y) \left( \int_{-N}^{N} e^{i(x-y)s} \, ds \right) dy && \text{by Fubini} \\
&= \frac{1}{2\pi} \int_{-\infty}^{+\infty} f(y) \left[ \frac{e^{i(x-y)s}}{i(x-y)} \right]_{s=-N}^{s=N} dy && \text{if } x \neq y \\
&= \frac{1}{2\pi i} \int_{-\infty}^{+\infty} \frac{f(y)}{x-y} \left( e^{iN(x-y)} - e^{-iN(x-y)} \right) dy \\
&= \frac{1}{\pi} \int_{-\infty}^{+\infty} \frac{f(y) \sin N(x-y)}{x-y} \, dy \\
&= H(\tilde f)(x)
\end{aligned}
\]
for some modification $\tilde f$ of $f$. So as $N \to \infty$, the behaviour of $T_N f$ is linked to $H\tilde f$; that is, the convergence of $T_N f$ and the Hilbert transform $H$ are linked!

Books. The sections on Fourier series, the Fourier transform and the Hilbert transform are based on the book of Duoandikoetxea [Duo], while the section on distribution theory is based on the book of Friedlander and Joshi [Fri&Jos]. Both books are readable yet clear introductions to the subject. For an introduction to Fourier series, Fourier transforms and their applications to differential equations, the books of Folland [Fol] and Stein and Shakarchi [Ste&Sha] are pitched below the level of the course and will be useful as background reading. For further reading in Fourier analysis, the books of Grafakos [GraCl] and [GraMo] are comprehensive yet very readable, and are highly recommended; alternatively, the classic books by Stein [SteHA] and [SteSI] are excellent reference works. For background in the functional analysis and distribution theory involved, Rudin's book [Rud] is a nice introduction to the subject (which includes topological vector spaces), while Yosida's book [Yos] is a comprehensive reference, though it is perhaps a little outdated now.

1 Fourier Series

1.1 Basic Definitions and the Dirichlet Kernel

We will consider Fourier series on $\mathbb{T} := \mathbb{R}/2\pi\mathbb{Z}$; that is, a function $f : \mathbb{T} \to \mathbb{R}$ is a $2\pi$-periodic function on $\mathbb{R}$ (or on $[-\pi, \pi]$, depending on your point of view). It will be convenient to abuse notation at various points and consider the domain of such functions to be $[0, 2\pi]$ or $[-\pi, \pi]$ or similar, as appropriate. We also define $C(\mathbb{T})$ to be the set of continuous functions on $[-\pi, \pi]$ which are $2\pi$-periodic, $L^1(\mathbb{T})$ to be the set of $L^1$ functions on $[-\pi, \pi]$ which are $2\pi$-periodic, and so on.

Definition 1.1 (Fourier coefficients). Given $f : \mathbb{T} \to \mathbb{R}$, define the $n$-th Fourier coefficient by
\[
\hat f(n) := \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iny} \, dy.
\]

Define the basis functions $e_n(x) := e^{inx}$. Then the $e_n$ are orthogonal with respect to the $L^2$ inner product: for $f, g \in L^2(\mathbb{T})$ define their inner product by
\[
\langle f, g \rangle := \int_{-\pi}^{\pi} f(x) \overline{g(x)} \, dx.
\]
We thus observe that
\[
\hat f(n) = \frac{1}{2\pi} \langle f, e_n \rangle;
\]
that is, $\hat f(n)$ is the projection of $f$ onto $e_n$. From this, we note that $\hat f$ is naturally defined when $f \in L^2(\mathbb{T})$, but that $\hat f$ also makes sense when $f \in L^1(\mathbb{T})$. Furthermore, since $|e^{-iny}| = 1$, we have $|\hat f(n)| \le \frac{1}{2\pi} \|f\|_{L^1}$.

Theorem 1.2 (Riemann-Lebesgue lemma). If $f \in L^1(\mathbb{T})$, then $\hat f(n) \to 0$ as $n \to \pm\infty$.

Proof. For $a \in \mathbb{R}$, we define $f_a(x) := f(x - a)$. Its Fourier coefficient is given by
\[
\hat f_a(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f_a(y) e^{-iny} \, dy = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y - a) e^{-iny} \, dy.
\]
We now let $z = y - a$; the limits of integration do not change as $f$ is $2\pi$-periodic, so we obtain:
\[
\hat f_a(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(z) e^{-in(z+a)} \, dz = e^{-ina} \, \frac{1}{2\pi} \int_{-\pi}^{\pi} f(z) e^{-inz} \, dz = e^{-ina} \hat f(n).
\]

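Now, choose $a = \pi/n$. Then $e^{-ina} = -1$; for such $a$, we have
\[
\begin{aligned}
2 \hat f(n) &= \hat f(n) - e^{-ina} \hat f(n) = \hat f(n) - \hat f_a(n) \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iny} \, dy - \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y - a) e^{-iny} \, dy \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} \big( f(y) - f(y - a) \big) e^{-iny} \, dy,
\end{aligned}
\]
and hence
\[
2 |\hat f(n)| \le \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(y) - f(y - a)| \, dy.
\]
Note that $a \to 0$ as $n \to +\infty$. If $f \in C^0(\mathbb{T})$, then $f(y) - f(y - \frac{\pi}{n}) \to 0$ for every $y \in [0, 2\pi]$. Hence by the Dominated Convergence Theorem,
\[
\lim_{n \to +\infty} \int_{-\pi}^{\pi} |f(y) - f(y - a)| \, dy = 0.
\]
In general, if $f \in L^1(\mathbb{T})$, there exists $g \in C^0(\mathbb{T}) \cap L^1(\mathbb{T})$ such that $\|f - g\|_{L^1} < \varepsilon/2$. Take $K$ large enough so that $|\hat g(k)| < \varepsilon/2$ whenever $|k| \ge K$; then
\[
|\hat f(k)| \le |\hat f(k) - \hat g(k)| + |\hat g(k)| \le \|f - g\|_{L^1} + \varepsilon/2 < \varepsilon.
\]

The fundamental question is whether we can recover $f$ as a Fourier series. Define
\[
S_N f(x) = \sum_{k=-N}^{N} \hat f(k) e^{ikx}.
\]
Is it true that $S_N f(x) \to f(x)$ as $N \to \infty$? In general, that is far too much to hope for. For certain kinds of $f$ convergence is assured; but we will see that some functions $f$ are so weird that $S_N f(x)$ diverges for every $x \in \mathbb{T}$!

In order to study the convergence of Fourier series, it will be helpful to rewrite $S_N f$ as a particular kind of integral known as a convolution. Notice that
\[
\begin{aligned}
S_N f(x) &= \sum_{k=-N}^{N} \hat f(k) e^{ikx} \\
&= \sum_{k=-N}^{N} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) e^{-iky} \, dy \; e^{ikx} \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) \sum_{k=-N}^{N} e^{ik(x-y)} \, dy \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) D_N(x - y) \, dy \\
&= \frac{1}{2\pi} (f * D_N)(x),
\end{aligned}
\]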

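where $D_N(t) = \sum_{k=-N}^{N} e^{ikt}$ is the Dirichlet kernel. Here $f * g$ means the convolution of $f$ and $g$, which is defined as
\[
(f * g)(x) = \int_{-\pi}^{\pi} f(y) g(x - y) \, dy = \int_{-\pi}^{\pi} f(x - y) g(y) \, dy,
\]
where the second equality follows by a change of variables. The Dirichlet kernel is a somewhat awkward sum; the following result shows that we can reduce it to a single quotient of sines:

Lemma 1.3.
\[
D_N(x) := \sum_{k=-N}^{N} e^{ikx} = \frac{\sin (N + \tfrac12) x}{\sin \tfrac{x}{2}}.
\]

Proof. We compute:
\[
\begin{aligned}
D_N(x) = \sum_{k=-N}^{N} e^{ikx} &= e^{-iNx} \sum_{k=0}^{2N} e^{ikx} \\
&= e^{-iNx} \, \frac{1 - e^{i(2N+1)x}}{1 - e^{ix}} && \text{using the geometric series formula} \\
&= \frac{e^{-iNx} - e^{i(N+1)x}}{1 - e^{ix}} \\
&= \frac{e^{-i(N+\frac12)x} - e^{i(N+\frac12)x}}{e^{-ix/2} - e^{ix/2}} && \text{multiplying top and bottom by } e^{-ix/2} \\
&= \frac{\big( e^{i(N+\frac12)x} - e^{-i(N+\frac12)x} \big)/2i}{\big( e^{ix/2} - e^{-ix/2} \big)/2i} \\
&= \frac{\sin (N + \tfrac12) x}{\sin \tfrac{x}{2}}.
\end{aligned}
\]

Before we move on, let us observe that
\[
\int_{-\pi}^{\pi} D_N(y) \, dy = \sum_{k=-N}^{N} \int_{-\pi}^{\pi} e^{iky} \, dy = 2\pi.
\]

1.2 Convergence and Divergence

Using the Dirichlet kernel, we can go on to prove results about when the Fourier series $S_N f$ converges to $f$. The first three results rely only on the Riemann-Lebesgue lemma and do not require any more complicated results.

Theorem 1.4 (Convergence of $S_N f$ is local). Let $f \in L^1(\mathbb{T})$. Suppose that $f$ is $0$ in a neighbourhood of $x$; that is, there exists $\delta > 0$ such that $f(y) = 0$ for all $y \in (x - \delta, x + \delta)$. Then $S_N f(x) \to 0$.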

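This result is significant in that while $\hat f(n)$ depends on the values of $f$ globally (that is, to calculate $\hat f(n)$ we need to know $f$ everywhere), the convergence of $S_N f(x)$ only depends on a local neighbourhood of $x$.

Proof. We compute:
\[
\begin{aligned}
S_N f(x) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) D_N(x - y) \, dy \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) D_N(y) \, dy \\
&= \frac{1}{2\pi} \int_{[-\pi,\pi] \setminus [-\delta,\delta]} f(x - y) D_N(y) \, dy \\
&= \frac{1}{2\pi} \int_{[-\pi,\pi] \setminus [-\delta,\delta]} \frac{f(x - y)}{\sin \frac{y}{2}} \sin (N + \tfrac12) y \, dy.
\end{aligned}
\]
Now, $y \mapsto \frac{f(x-y)}{\sin (y/2)}$ is an $L^1$ function on $S := [-\pi, \pi] \setminus [-\delta, \delta]$, so
\[
\begin{aligned}
S_N f(x) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} \chi_S(y) \frac{f(x - y)}{\sin \frac{y}{2}} \cdot \frac{e^{i(N+\frac12)y} - e^{-i(N+\frac12)y}}{2i} \, dy \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} \chi_S(y) \frac{f(x - y)}{2i \sin \frac{y}{2}} e^{iy/2} e^{iNy} \, dy - \frac{1}{2\pi} \int_{-\pi}^{\pi} \chi_S(y) \frac{f(x - y)}{2i \sin \frac{y}{2}} e^{-iy/2} e^{-iNy} \, dy.
\end{aligned}
\]
Setting
\[
g(y) := \chi_S(y) \frac{f(x - y)}{2i \sin \frac{y}{2}} e^{iy/2} \quad \text{and} \quad h(y) := \chi_S(y) \frac{f(x - y)}{2i \sin \frac{y}{2}} e^{-iy/2},
\]
we see that the two terms are nothing but Fourier coefficients of $g$ and $h$, so that
\[
S_N f(x) = \hat g(-N) - \hat h(N) \to 0
\]
by the Riemann-Lebesgue lemma.

The next result represents pretty much the minimal hypotheses you need to ensure that $S_N f(x)$ converges to $f(x)$:

Theorem 1.5 (Dini's convergence theorem). Let $f \in L^1(\mathbb{T})$. Suppose that there exists $\delta > 0$ such that
\[
\int_{|t| < \delta} \left| \frac{f(x - t) - f(x)}{t} \right| dt < +\infty.
\]
Then $S_N f(x) \to f(x)$.

Recall that if $f \in C^1$, then
\[
\lim_{t \to 0} \frac{f(x - t) - f(x)}{t}
\]
exists and is bounded for all $x$; in which case, taking $\delta = \pi$,
\[
\int_{|t| < \pi} \left| \frac{f(x - t) - f(x)}{t} \right| dt \le 2\pi \|f'\|_{L^\infty} < +\infty.
\]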

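Proof of Theorem 1.5. Using $\frac{1}{2\pi} \int_{-\pi}^{\pi} D_N(y) \, dy = 1$, we compute
\[
\begin{aligned}
S_N f(x) - f(x) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) D_N(y) \, dy - \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) D_N(y) \, dy \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{f(x - y) - f(x)}{y} \cdot \frac{y}{\sin (y/2)} \sin (N + \tfrac12) y \, dy \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{f(x - y) - f(x)}{y} \cdot \frac{y}{\sin (y/2)} \cdot \frac{1}{2i} \left( e^{iy/2} e^{iNy} - e^{-iy/2} e^{-iNy} \right) dy \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} g(y) e^{iNy} \, dy - \frac{1}{2\pi} \int_{-\pi}^{\pi} h(y) e^{-iNy} \, dy,
\end{aligned}
\]
where
\[
g(y) := \frac{1}{2i} \cdot \frac{f(x - y) - f(x)}{y} \cdot \frac{y}{\sin (y/2)} e^{iy/2}, \qquad h(y) := \frac{1}{2i} \cdot \frac{f(x - y) - f(x)}{y} \cdot \frac{y}{\sin (y/2)} e^{-iy/2}.
\]
Thus $S_N f(x) - f(x) = \hat g(-N) - \hat h(N)$, and by the Riemann-Lebesgue lemma, to complete the proof it suffices to show that $g, h \in L^1(\mathbb{T})$. For $g$ (noting that $|g| = |h|$), we need that
\[
\int_{-\pi}^{\pi} \left| \frac{f(x - y) - f(x)}{y} \cdot \frac{y}{\sin (y/2)} e^{iy/2} \right| dy < +\infty.
\]
We split the integral into the regions where $|y| < \delta$ and $\delta < |y| < \pi$, as in the previous proof. For $\delta < |y| < \pi$, we have that $\frac{1}{|\sin (y/2)|} \le M$ for some constant $M$ depending on $\delta$, so
\[
\int_{\delta < |y| < \pi} \left| \frac{f(x - y) - f(x)}{\sin (y/2)} e^{iy/2} \right| dy \le M \int_{\delta < |y| < \pi} |f(x - y) - f(x)| \, dy \le M \big( \|f\|_{L^1} + 2\pi |f(x)| \big) < +\infty.
\]
For $|y| < \delta$, we observe that $\left| \frac{y}{\sin (y/2)} \right| \le \pi$ for $|y| \le \pi$, so that
\[
\int_{|y| < \delta} \left| \frac{f(x - y) - f(x)}{y} \cdot \frac{y}{\sin (y/2)} e^{iy/2} \right| dy \le \pi \int_{|y| < \delta} \left| \frac{f(x - y) - f(x)}{y} \right| dy < +\infty,
\]
by assumption.

Observe that for Dini's theorem to hold, it is in fact enough to have that there exist constants $R > 0$, $\alpha \in (0, 1]$ and $C > 0$ such that whenever $|y| \le R$, we have
\[
|f(x - y) - f(x)| \le C |y|^\alpha.
\]
Such functions are called $\alpha$-Hölder continuous. (The definition of $\alpha$-Hölder continuous is often stated without the restriction that $|y| \le R$; the $R$ is strictly only needed when the domain is non-compact.) Note that if $f \in C^1(\mathbb{T})$, then $f$ is automatically $1$-Hölder (also known as Lipschitz).

The next theorem tells us that if $f$ is of bounded variation (loosely speaking, if $f$ does not oscillate too much) then $S_N f(x)$ converges to the average of the left and right limits of $f$ at $x$: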

Theorem 1.6 (Jordan's criterion). Let $f \in L^1(\mathbb{T})$ be of bounded variation. Then
\[
S_N f(x) \to \frac{f(x^+) + f(x^-)}{2},
\]
where $f(x^+) = \lim_{h \to 0^+} f(x + h)$, and $f(x^-) = \lim_{h \to 0^+} f(x - h)$.

Indeed, the theorem holds (and the same proof works) if $f$ is only BV in a neighbourhood of $x$. To prove Jordan's criterion, we first recall some facts about BV functions. Recall that $f : [a, b] \to \mathbb{R}$ is of bounded variation if
\[
\sup \left\{ \sum_{i=1}^{n} |f(t_i) - f(t_{i-1})| : a = t_0 < t_1 < \dots < t_{n-1} < t_n = b, \ n \in \mathbb{N} \right\} < +\infty.
\]
Recall that the total variation of $f$ is defined to be
\[
T_f(x) := \sup \left\{ \sum_{j=1}^{n} |f(x_j) - f(x_{j-1})| : a = x_0 < x_1 < \dots < x_n = x \right\},
\]
where the sup is taken over all partitions of $[a, x]$. Thus, $f$ is of bounded variation on $[a, b]$ if and only if $T_f(x)$ is bounded on $[a, b]$. Furthermore, given any function $f \in \mathrm{BV}([a, b])$, we may write
\[
f(x) = \tfrac12 \big( T_f(x) + f(x) \big) - \tfrac12 \big( T_f(x) - f(x) \big);
\]
observe that $f^+(x) := \tfrac12 (T_f(x) + f(x))$ and $f^-(x) := \tfrac12 (T_f(x) - f(x))$ are both monotone increasing functions. That is, a function is of bounded variation if and only if it is the difference of two monotone increasing functions.

We also recall the mean value formula for integrals:

Lemma 1.7 (Mean value formula for integrals). Let $\varphi : [a, b] \to \mathbb{R}$ be continuous and let $h : [a, b] \to \mathbb{R}$ be monotone. Then there exists $c \in (a, b)$ such that
\[
\int_a^b \varphi(x) h(x) \, dx = h(b^-) \int_c^b \varphi(x) \, dx + h(a^+) \int_a^c \varphi(x) \, dx.
\]

With this in hand, we now proceed to the proof of Jordan's criterion:

Proof of Theorem 1.6. As $D_N(y)$ is even, we can rewrite
\[
S_N f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) D_N(y) \, dy = \frac{1}{2\pi} \int_0^{\pi} \big( f(x - y) + f(x + y) \big) D_N(y) \, dy.
\]
As every BV function $f$ is the difference of two monotonic functions, it suffices to show that
\[
\frac{1}{2\pi} \int_0^{\pi} g(y) D_N(y) \, dy \to \frac{g(0^+)}{2}
\]

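as $N \to \infty$, where $g$ is monotone, since then we can take $g(y) = f(x + y)$ and $g(y) = f(x - y)$ to complete the result. Define $\tilde g(y) = g(y) - g(0^+)$; notice that
\[
\frac{1}{2\pi} \int_0^{\pi} g(y) D_N(y) \, dy \to \frac{g(0^+)}{2}
\]
if and only if
\[
\frac{1}{2\pi} \int_0^{\pi} \big( g(y) - g(0^+) \big) D_N(y) \, dy \to 0,
\]
that is, if and only if
\[
\frac{1}{2\pi} \int_0^{\pi} \tilde g(y) D_N(y) \, dy \to 0,
\]
as $\frac{1}{2\pi} \int_0^{\pi} D_N(y) \, dy = \frac12$. Thus, without loss of generality, suppose that $g(0^+) = 0$ and that $g$ is monotone increasing. We now use Lemma 1.7 to prove that
\[
\frac{1}{2\pi} \int_0^{\pi} g(y) D_N(y) \, dy \to 0
\]
as $N \to \infty$. As $g(0^+) = 0$, for every $\varepsilon > 0$ there exists $\delta > 0$ such that $0 \le g(x) < \varepsilon$ whenever $0 < x < \delta$. Then
\[
\frac{1}{2\pi} \int_0^{\pi} g(y) D_N(y) \, dy = \underbrace{\frac{1}{2\pi} \int_0^{\delta} g(y) D_N(y) \, dy}_{=: I_1} + \underbrace{\frac{1}{2\pi} \int_{\delta}^{\pi} g(y) D_N(y) \, dy}_{=: I_2}.
\]
Now
\[
I_2 = \frac{1}{2\pi} \int_{\delta}^{\pi} \frac{g(y)}{\sin (y/2)} \sin (N + \tfrac12) y \, dy = \frac{1}{2\pi} \int_{-\pi}^{\pi} \underbrace{\frac{g(y)}{\sin (y/2)} \chi_{[\delta, \pi]}(y)}_{\in L^1} \sin (N + \tfrac12) y \, dy \to 0
\]
as $N \to \infty$, by the Riemann-Lebesgue lemma. By the mean value formula above, taking $h = g$ and $\varphi = D_N$ on $[0, \delta]$, and using $g(0^+) = 0$, there exists $c \in (0, \delta)$ such that
\[
I_1 = \frac{1}{2\pi} \int_0^{\delta} g(y) D_N(y) \, dy = \frac{1}{2\pi} g(\delta^-) \int_c^{\delta} D_N(y) \, dy,
\]
so that
\[
|I_1| \le \frac{\varepsilon}{2\pi} \sup_{c, \delta, N} \left| \int_c^{\delta} D_N(y) \, dy \right|.
\]
As long as the sup is finite, we can send $\varepsilon \to 0$ and we are done, so:
\[
\begin{aligned}
\left| \int_c^{\delta} D_N(y) \, dy \right| &= \left| \int_c^{\delta} \sin (N + \tfrac12) y \left[ \frac{1}{\sin (y/2)} - \frac{1}{y/2} \right] dy + \int_c^{\delta} \frac{\sin (N + \tfrac12) y}{y/2} \, dy \right| \\
&\le K + 4 \sup_{M > 0} \left| \int_0^{M} \frac{\sin y}{y} \, dy \right|,
\end{aligned}
\]
where $K := \int_0^{\pi} \left| \frac{1}{\sin (y/2)} - \frac{2}{y} \right| dy < \infty$ and the second term comes from the change of variables $u = (N + \tfrac12) y$. This is bounded independently of $c$, $\delta$ and $N$, which completes the proof.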
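As a numerical illustration of Jordan's criterion, the sketch below evaluates the symmetric partial sums for one concrete function of bounded variation. The choice of example is an assumption made here and is not taken from the lectures: $f(y) = e^y$ on $[-\pi, \pi)$, extended $2\pi$-periodically, whose Fourier coefficients are $\hat f(n) = (-1)^n \sinh(\pi) / (\pi (1 - in))$ by direct integration. At the jump $x = \pi$ the criterion predicts the limit $(e^{\pi} + e^{-\pi})/2 = \cosh \pi$, and at the point of continuity $x = 1$ it predicts $e$. The function names `coeff` and `partial_sum` are hypothetical.

```python
import cmath
import math


def coeff(n):
    """n-th Fourier coefficient of f(y) = e^y on [-pi, pi), extended 2*pi-periodically.

    Obtained by integrating (1/2pi) * e^{(1 - in) y} over [-pi, pi]
    (an illustrative example chosen here, not part of the lectures).
    """
    sign = 1 if n % 2 == 0 else -1
    return sign * math.sinh(math.pi) / (math.pi * (1 - 1j * n))


def partial_sum(x, N):
    """Symmetric partial sum S_N f(x) = sum over |n| <= N of coeff(n) * e^{inx}."""
    total = sum(coeff(n) * cmath.exp(1j * n * x) for n in range(-N, N + 1))
    return total.real  # imaginary parts cancel because f is real-valued


if __name__ == "__main__":
    for N in (10, 100, 1000):
        print(N, partial_sum(math.pi, N), partial_sum(1.0, N))
    print("midpoint of the jump:", math.cosh(math.pi), " f(1):", math.e)
```

The convergence at the jump is slow (the error decays roughly like $1/N$), which is consistent with the fact that the theorem gives convergence but no rate.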

Dini's theorem and Jordan's criterion may lead one to think that getting a Fourier series to converge is relatively easy. Unfortunately, it is not. Du Bois-Reymond showed in 1873 that even if $f$ is continuous, it is possible for the Fourier series to diverge at a point.

Theorem 1.8 (du Bois-Reymond, 1873). There exists a continuous function $f : \mathbb{T} \to \mathbb{R}$ for which $S_N f(x)$ diverges for at least one $x$.

Taking such an $f$ from this theorem, and assuming (without loss of generality) that $S_N f(0)$ diverges, we can, by enumerating the rational numbers as $(r_n)_{n \in \mathbb{N}}$, construct a function
\[
g(x) = \sum_{n=1}^{\infty} \frac{f(x - r_n)}{2^n}
\]
whose Fourier series diverges at every rational point.

To prove du Bois-Reymond's theorem, we need the uniform boundedness principle from functional analysis:

Lemma 1.9 (Uniform boundedness principle). If $X$ is a Banach space and $Y$ is a normed vector space, and $T_\alpha : X \to Y$ is a bounded linear map for each $\alpha \in \Lambda$ (where $\Lambda$ is any index set, not necessarily countable), then either
(i) $\sup_{\alpha \in \Lambda} \|T_\alpha\|_{\mathrm{op}} < \infty$, or
(ii) there exists $x \in X$ such that $\sup_{\alpha \in \Lambda} \|T_\alpha x\|_Y = +\infty$.

Proof of Theorem 1.8. Let $X = C(\mathbb{T})$, $Y = \mathbb{C}$, and consider the maps $T_N : C(\mathbb{T}) \to \mathbb{C}$ for $N \in \mathbb{N}$ given by
\[
T_N f := S_N f(0) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) D_N(y) \, dy
\]
(where we have used the fact that $D_N$ is even). As $D_N(y) = \frac{\sin (N + \frac12) y}{\sin (y/2)}$ has finitely many zeros, $g(y) := \operatorname{sgn}(D_N(y))$ is a measurable (but not continuous) function. We would like to consider
\[
L_N := T_N(g) = \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_N(y)| \, dy;
\]
indeed, the definition of $T_N$ makes sense for any $f \in L^\infty$, whence we have that
\[
|T_N f| = |S_N f(0)| \le L_N \|f\|_{L^\infty}
\]
and hence that $\|T_N\|_{\mathrm{op}} \le L_N$. (We have bounded the operator norm of $T_N$ as an operator from $L^\infty(\mathbb{T}) \to \mathbb{C}$, and used the fact that restriction never increases the norm.) Clearly by using $g$ we can see that the operator norm of $T_N$ as an operator from $L^\infty(\mathbb{T}) \to \mathbb{C}$ is exactly $L_N$. As $g$ is not continuous, we approximate it: since $g$ has only finitely many jumps, modifying it linearly near each jump shows that, given $\varepsilon > 0$, there exists $h \in C^0(\mathbb{T})$ with $\|h\|_{L^\infty} \le 1$ such that
\[
\frac{1}{2\pi} \int_{-\pi}^{\pi} h(y) D_N(y) \, dy \ge L_N - \varepsilon.
\]
Hence the operator norm of $T_N : C(\mathbb{T}) \to \mathbb{C}$ is $\|T_N\|_{\mathrm{op}} = L_N$.

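By the uniform boundedness principle, either $L_N \le K$ for some $K < \infty$ and all $N \in \mathbb{N}$, or there exists $f \in C(\mathbb{T})$ such that $\limsup_{N \to \infty} |T_N f| = +\infty$. We wish to exclude the first possibility to show that there is a function $f$ such that $S_N f(0)$ diverges; to do so, we prove that $L_N \to +\infty$, by showing that
\[
L_N = \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_N(y)| \, dy = \frac{4}{\pi^2} \log N + O(1). \tag{1}
\]
To see this, we compute:
\[
\begin{aligned}
L_N &= \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| \frac{\sin (N + \frac12) y}{\sin (y/2)} \right| dy \\
&= \frac{1}{\pi} \int_0^{\pi} \left| \frac{\sin (N + \frac12) y}{\sin (y/2)} \right| dy && \text{as } D_N \text{ is even} \\
&= \frac{1}{\pi} \int_0^{\pi} \frac{|\sin (N + \frac12) y|}{\sin (y/2)} \, dy && \text{as } \sin \ge 0 \text{ on } [0, \pi/2] \\
&= \frac{1}{\pi} \int_0^{\pi} |\sin (N + \tfrac12) y| \left[ \frac{1}{\sin (y/2)} - \frac{1}{y/2} + \frac{1}{y/2} \right] dy \\
&= \frac{1}{\pi} \int_0^{\pi} |\sin (N + \tfrac12) y| \left[ \frac{1}{\sin (y/2)} - \frac{1}{y/2} \right] dy + \frac{2}{\pi} \int_0^{\pi} \frac{|\sin (N + \frac12) y|}{y} \, dy.
\end{aligned}
\]
As $\frac{1}{\sin (y/2)} - \frac{1}{y/2}$ is bounded on $(0, \pi]$, the first integral is $O(1)$. Consider the second integral under the change of variables $(N + \tfrac12) y = \pi z$:
\[
\begin{aligned}
L_N &= \frac{2}{\pi} \int_0^{\pi} \frac{|\sin (N + \frac12) y|}{y} \, dy + O(1) \\
&= \frac{2}{\pi} \int_0^{N + \frac12} \frac{|\sin \pi z|}{z} \, dz + O(1) \\
&= \frac{2}{\pi} \sum_{k=0}^{N-1} \int_k^{k+1} \frac{|\sin \pi z|}{z} \, dz + O(1) \\
&= \frac{2}{\pi} \sum_{k=0}^{N-1} \int_0^1 \frac{|\sin \pi z|}{z + k} \, dz + O(1) \\
&= \frac{2}{\pi} \int_0^1 |\sin \pi z| \sum_{k=1}^{N-1} \frac{1}{z + k} \, dz + O(1) && \text{the } k = 0 \text{ term is } O(1) \\
&= \frac{2}{\pi} \log N \underbrace{\int_0^1 |\sin \pi z| \, dz}_{= 2/\pi} + O(1) \\
&= \frac{4}{\pi^2} \log N + O(1).
\end{aligned}
\]

The moral of this theorem is that pointwise convergence is, quite simply, too much to ask. Kolmogorov showed just how much it is to ask that the Fourier series of a function converge pointwise: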

Theorem 1.10 (Kolmogorov, 1926). There exists $f \in L^1(\mathbb{T})$ such that $S_N f(x)$ diverges at every point $x \in \mathbb{T}$.

The proof of this theorem is beyond the scope of the course. The existence of a function in $L^1(\mathbb{T})$ whose Fourier series diverges almost everywhere is demonstrated in section 3.4 of [GraCl].

Having seen that pointwise convergence is, ultimately, fruitless in many cases, we move on to a different flavour of convergence results. Given a function $f$, form the partial sums of the Fourier series $S_N f$.

1. Does $S_N f \to f$ in the $L^p$ norm?
2. Does $S_N f \to f$ almost everywhere?

For the second question, Carleson and Hunt showed that Kolmogorov's example of a function whose Fourier series diverges is largely due to the nature of $L^1$, and that considering $L^p$ for $1 < p < \infty$ actually gains us almost everywhere pointwise convergence:

Theorem 1.11 (Carleson, 1965). If $f \in L^2(\mathbb{T})$, then $S_N f(x)$ converges to $f(x)$ for almost every $x \in \mathbb{T}$.

Carleson's theorem, which won him the Abel prize, was extended by Hunt a few years later:

Theorem 1.12 (Hunt, 1967). Let $1 < p < \infty$. If $f \in L^p(\mathbb{T})$, then $S_N f(x)$ converges to $f(x)$ for almost every $x \in \mathbb{T}$.

Again, the proof of the Carleson-Hunt theorem is beyond the scope of the course: in fact, it occupies a whole chapter of [GraMo], and is not for the faint of heart.

With regard to the first question, the main theorem on convergence in the $L^p$ norm, which will take some time to build up to, is the following:

Theorem 1.13. Let $1 < p < \infty$. If $f \in L^p(\mathbb{T})$, then $S_N f \to f$ in the $L^p$ norm; that is, $\|S_N f - f\|_{L^p} \to 0$.

The theorem is easy enough to prove in $L^2$ as it is a Hilbert space; the real meat of the theorem is for $p \neq 2$, where $L^p$ is a Banach space but not a Hilbert space. We will prove the result using the following key step:

Proposition 1.14. Let $1 < p < \infty$. The following are equivalent:
(a) for all $f \in L^p(\mathbb{T})$, $S_N f \to f$ in the $L^p$ norm;
(b) there exists a constant $c_p$ such that, for all $f \in L^p(\mathbb{T})$ and all $N \in \mathbb{N}$, $\|S_N f\|_{L^p} \le c_p \|f\|_{L^p}$.

Proof. First, we show that, if $S_N f \to f$ in the $L^p$ norm for all $f \in L^p(\mathbb{T})$, then such a constant $c_p$ exists. Consider $S_N : L^p(\mathbb{T}) \to L^p(\mathbb{T})$ as operators for each $N \in \mathbb{N}$. Each $S_N$ is a bounded linear operator, so by the uniform boundedness principle, either

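(i) $\sup_{N \in \mathbb{N}} \|S_N\|_{\mathrm{op}} < \infty$, or
(ii) there exists $f \in L^p$ such that $\limsup_{N \to \infty} \|S_N f\|_{L^p} = +\infty$.

We show that (ii) cannot hold, so that (i) holds. Given $f \in L^p(\mathbb{T})$, pick $N_0$ large enough so that $\|S_N f - f\|_{L^p} < 1$ for all $N \ge N_0$. Then for such $N$,
\[
\|S_N f\|_{L^p} \le \|S_N f - f\|_{L^p} + \|f\|_{L^p} \le 1 + \|f\|_{L^p}.
\]
Hence $\|S_N f\|_{L^p}$ is bounded for $N \ge N_0$, and so $\limsup_{N \to \infty} \|S_N f\|_{L^p} < +\infty$ for every $f \in L^p(\mathbb{T})$. Thus (ii) fails, and therefore
\[
c_p := \sup_{N \in \mathbb{N}} \|S_N\|_{\mathrm{op}}
\]
exists and is finite.

For the converse (that is, showing that the existence of such a constant $c_p$ guarantees that $S_N f \to f$ in the $L^p$ norm for all $f \in L^p$) we first note that trigonometric polynomials are dense in $L^p$ whenever $1 \le p < \infty$: that is, given $\varepsilon > 0$ and $f \in L^p$, there exists a function $g : \mathbb{T} \to \mathbb{R}$ of the form
\[
g(x) = \sum_{k=-M}^{M} a_k e^{ikx}
\]
such that $\|f - g\|_{L^p} < \varepsilon$. So, fix $\varepsilon > 0$ and $f \in L^p$, and let $g$ be a trigonometric polynomial such that $\|f - g\|_{L^p} < \frac{\varepsilon}{1 + c_p}$. Then whenever $N \ge \deg(g)$, we have that $S_N g = g$, so that
\[
\begin{aligned}
\|S_N f - f\|_{L^p} &\le \|S_N f - S_N g\|_{L^p} + \|S_N g - g\|_{L^p} + \|g - f\|_{L^p} \\
&= \|S_N (f - g)\|_{L^p} + 0 + \|f - g\|_{L^p} \\
&\le (1 + c_p) \|f - g\|_{L^p} \\
&< (1 + c_p) \frac{\varepsilon}{1 + c_p} = \varepsilon.
\end{aligned}
\]
Hence $S_N f \to f$ in the $L^p$ norm, for any $f \in L^p$.

To see that trigonometric polynomials are dense in $L^p(\mathbb{T})$, recall that if $f \in C^1(\mathbb{T})$ then $S_N f \to f$ uniformly; that is, given $\varepsilon > 0$, there exists $N$ such that for $n \ge N$,
\[
|S_n f(x) - f(x)| < \frac{\varepsilon}{(2\pi)^{1/p}}
\]
for all $x \in \mathbb{T}$. Observe that $S_N f$ is a trigonometric polynomial, and that
\[
\|S_N f - f\|_{L^p} = \left( \int_{-\pi}^{\pi} |S_N f(x) - f(x)|^p \, dx \right)^{1/p} \le \frac{\varepsilon}{(2\pi)^{1/p}} (2\pi)^{1/p} = \varepsilon,
\]
so the trigonometric polynomials are dense in $C^1(\mathbb{T})$ with respect to the $L^p$ norm, and $C^1(\mathbb{T})$ is dense in $L^p(\mathbb{T})$.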
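The growth of the constants $L_N = \frac{1}{2\pi} \int_{-\pi}^{\pi} |D_N(y)| \, dy$ from the proof of du Bois-Reymond's theorem can be observed numerically. The following is only a sketch under stated assumptions: it uses a plain midpoint rule (the step count is an arbitrary choice made here), the function names are hypothetical, and it checks the closed form of the Dirichlet kernel against its defining sum before estimating $L_N$ and the difference $L_N - \frac{4}{\pi^2} \log N$, which the estimate in the text predicts stays bounded.

```python
import math


def dirichlet(N, y):
    """Closed form of the Dirichlet kernel (valid when sin(y/2) != 0)."""
    return math.sin((N + 0.5) * y) / math.sin(y / 2)


def dirichlet_sum(N, y):
    """Defining sum: sum_{k=-N}^{N} e^{iky} = 1 + 2 * sum_{k=1}^{N} cos(ky)."""
    return 1 + 2 * sum(math.cos(k * y) for k in range(1, N + 1))


def lebesgue_constant(N, steps=100_000):
    """Midpoint-rule estimate of L_N = (1/2pi) * integral of |D_N| over [-pi, pi].

    Uses the evenness of D_N to integrate over [0, pi] only;
    the midpoints never hit y = 0, where the closed form is 0/0.
    """
    h = math.pi / steps
    total = sum(abs(dirichlet(N, (j + 0.5) * h)) for j in range(steps))
    return total * h / math.pi


if __name__ == "__main__":
    for N in (1, 10, 100, 1000):
        L = lebesgue_constant(N)
        print(N, L, L - 4 / math.pi**2 * math.log(N))
```

For $N = 1$ the integral can be done by hand, since $D_1(y) = 1 + 2\cos y$ changes sign only at $|y| = 2\pi/3$; this gives $L_1 = \frac13 + \frac{2\sqrt{3}}{\pi}$, a convenient check on the quadrature.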

1.3 Good Kernels and PDEs

We saw that the convergence of $S_N f$ is closely related to the properties of the Dirichlet kernel $D_N$, by the equation
\[
S_N f(x) = \frac{1}{2\pi} (f * D_N)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) D_N(x - y) \, dy.
\]
We exploited the fact that $\frac{1}{2\pi} \int_{-\pi}^{\pi} D_N(y) \, dy = 1$ for all $N$; we also exploited the fact that for fixed $\delta > 0$,
\[
\int_{\delta < |y| < \pi} D_N(y) \, dy \to 0
\]
as $N \to \infty$. Indeed, most of the proofs relied on only these two facts. Unfortunately, we also saw that
\[
\frac{1}{2\pi} \int_{-\pi}^{\pi} |D_N(y)| \, dy = \frac{4}{\pi^2} \log N + O(1)
\]
as $N \to \infty$; if these integrals had been bounded uniformly in $N$ some of the proofs would have been much easier.

We now consider other modes of convergence, and other ways of summing Fourier series: it turns out that the resulting sums can also be expressed as the convolution of $f$ with some kernel $K_N$. The other modes of convergence we will investigate, however, will have much nicer convergence properties than standard summation via convolution with the Dirichlet kernels. We generalise and define a good kernel as follows:

Definition 1.15 (Good kernel). Let $K_n : \mathbb{T} \to \mathbb{R}$ for $n \in \mathbb{N}$. $(K_n)$ is a family of good kernels if
(i) $\frac{1}{2\pi} \int_{-\pi}^{\pi} K_n(x) \, dx = 1$;
(ii) there exists a constant $K > 0$ such that, for all $n \in \mathbb{N}$, $\int_{-\pi}^{\pi} |K_n(x)| \, dx \le K$; and
(iii) for every $\delta > 0$, $\int_{\delta < |x| < \pi} |K_n(x)| \, dx \to 0$ as $n \to \infty$.

So $D_N$ satisfies (i), and satisfies (iii) in the weaker form without absolute values noted above, but not (ii), and thus is not a good kernel. The difference this makes is readily apparent: with property (ii), we can show that, in contrast to the theorem of du Bois-Reymond, if $K_n$ is a family of good kernels, then $\frac{1}{2\pi} (K_n * f)(x) \to f(x)$ at every point of continuity of $f$:

Theorem 1.16. Let $(K_n)$ be a family of good kernels, and let $f \in L^1(\mathbb{T}) \cap L^\infty(\mathbb{T})$. If $f$ is continuous at $x \in \mathbb{T}$, then
\[
\frac{1}{2\pi} (K_n * f)(x) \to f(x)
\]
as $n \to \infty$. In particular, if $f \in C^0(\mathbb{T})$, then the convergence is uniform; that is, for all $\varepsilon > 0$ there exists $N$ such that for all $x \in \mathbb{T}$ and $n \ge N$,
\[
\left| \frac{1}{2\pi} (K_n * f)(x) - f(x) \right| < \varepsilon.
\]

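Proof. Let $x$ be a point of continuity of $f$, and fix $\varepsilon > 0$. By property (ii), there exists $K$ such that for all $n \in \mathbb{N}$, $\int_{-\pi}^{\pi} |K_n(x)| \, dx \le K$. Take $\delta > 0$ such that whenever $|y| < \delta$, we have $|f(x - y) - f(x)| < \frac{\pi \varepsilon}{K}$. Given this $\delta$, using property (iii) choose $N$ such that, for $n \ge N$,
\[
\int_{\delta < |y| < \pi} |K_n(y)| \, dy \le \frac{\pi \varepsilon}{2 \|f\|_\infty}.
\]
Then, for $n \ge N$,
\[
\begin{aligned}
\left| \frac{1}{2\pi} (K_n * f)(x) - f(x) \right| &= \left| \frac{1}{2\pi} \int_{-\pi}^{\pi} K_n(y) \big( f(x - y) - f(x) \big) \, dy \right| && \text{by property (i)} \\
&\le \frac{1}{2\pi} \int_{-\delta}^{\delta} |K_n(y)| \, |f(x - y) - f(x)| \, dy + \frac{1}{2\pi} \int_{\delta \le |y| \le \pi} |K_n(y)| \, |f(x - y) - f(x)| \, dy \\
&\le \frac{1}{2\pi} \cdot \frac{\pi \varepsilon}{K} \int_{-\pi}^{\pi} |K_n(y)| \, dy + \frac{1}{2\pi} \cdot 2 \|f\|_\infty \int_{\delta < |y| < \pi} |K_n(y)| \, dy \\
&\le \frac{1}{2\pi} \cdot \frac{\pi \varepsilon}{K} \cdot K + \frac{1}{2\pi} \cdot 2 \|f\|_\infty \cdot \frac{\pi \varepsilon}{2 \|f\|_\infty} \\
&= \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{aligned}
\]
Finally, note that if $f \in C^0(\mathbb{T})$, then $f$ is uniformly continuous, so we may choose $\delta$ independently of $x$, and thus we may choose $N$ independently of $x$, and hence the convergence is uniform in $x$.

1.4 Cesàro Summation and the Fejér Kernel

Let $(a_n)_{n=0}^{\infty}$ be a sequence, and consider the $n$-th partial sum $s_n = a_0 + a_1 + \dots + a_n$. Does $(s_n)$ converge? That question is related to the convergence of
\[
\sigma_n = \frac{s_0 + s_1 + \dots + s_{n-1}}{n}.
\]
If $s_n \to s$, then $\sigma_n \to s$ as well. However, sometimes $\sigma_n$ will converge when $s_n$ does not. Let $a_n = (-1)^n$; then
\[
s_n = \underbrace{1 - 1 + 1 - \dots \pm 1}_{n + 1 \text{ terms}},
\]
so $s_0 = 1$, $s_1 = 0$, $s_2 = 1$, $s_3 = 0$, and so on. So $s_n$ does not converge. However, $\sigma_n \to \frac12$. We call $\sigma_n$ the $n$-th Cesàro mean of $s_n$. If $\sigma_n$ converges to $\sigma$, but $s_n$ does not converge, we say that $s_n \to \sigma$ in the Cesàro sense.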
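The example $a_n = (-1)^n$ can be checked directly. The short sketch below (the helper name `cesaro_means` is hypothetical) computes the Cesàro means $\sigma_n = (s_0 + \dots + s_{n-1})/n$ of the partial sums of a finite list of terms, using the same indexing as in the text.

```python
def cesaro_means(terms):
    """Return [sigma_1, sigma_2, ...] for the series with the given terms a_0, a_1, ...

    Here s_k = a_0 + ... + a_k and sigma_n = (s_0 + ... + s_{n-1}) / n.
    """
    means = []
    partial = 0.0  # current partial sum s_k
    running = 0.0  # s_0 + s_1 + ... + s_k
    for count, a in enumerate(terms, start=1):
        partial += a
        running += partial
        means.append(running / count)
    return means


if __name__ == "__main__":
    means = cesaro_means([(-1) ** n for n in range(1000)])
    print(means[:4], means[-1])
```

The partial sums alternate between 1 and 0, while the means settle down towards $\frac12$.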

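To apply this to Fourier series, given a function $f : \mathbb{T} \to \mathbb{R}$, with partial Fourier sums $S_n f(x)$, define
\[
\sigma_n f(x) := \frac{S_0 f(x) + S_1 f(x) + \dots + S_{n-1} f(x)}{n}.
\]
We wish to express $\sigma_n f(x)$ as the convolution of $f$ with some kernel, so we compute:
\[
\begin{aligned}
\sigma_n f(x) &= \frac{S_0 f(x) + S_1 f(x) + \dots + S_{n-1} f(x)}{n} \\
&= \frac{1}{n} \left[ \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) D_0(x - y) \, dy + \dots + \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) D_{n-1}(x - y) \, dy \right] \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} f(y) \left[ \frac{1}{n} \sum_{k=0}^{n-1} D_k(x - y) \right] dy.
\end{aligned}
\]
So we define the Fejér kernel as
\[
F_n(t) = \frac{1}{n} \sum_{k=0}^{n-1} D_k(t);
\]
by the above, we have that $\sigma_n f(x) = \frac{1}{2\pi} (F_n * f)(x)$.

We now express the Fejér kernel in a more convenient form. First, note that
\[
\begin{aligned}
\cos(kt) - \cos (k+1)t &= \cos \big( (k + \tfrac12) t - \tfrac{t}{2} \big) - \cos \big( (k + \tfrac12) t + \tfrac{t}{2} \big) \\
&= \big( \cos (k + \tfrac12) t \cos \tfrac{t}{2} + \sin (k + \tfrac12) t \sin \tfrac{t}{2} \big) - \big( \cos (k + \tfrac12) t \cos \tfrac{t}{2} - \sin (k + \tfrac12) t \sin \tfrac{t}{2} \big) \\
&= 2 \sin (k + \tfrac12) t \, \sin \tfrac{t}{2}.
\end{aligned}
\]
Using this, and the identity $2 \sin^2 (nt/2) = 1 - \cos(nt)$, we obtain:
\[
\begin{aligned}
F_n(t) &= \frac{1}{n} \sum_{k=0}^{n-1} D_k(t) \\
&= \frac{1}{n} \sum_{k=0}^{n-1} \frac{\sin (k + \tfrac12) t}{\sin (t/2)} \\
&= \frac{1}{n \sin^2 (t/2)} \sum_{k=0}^{n-1} \sin (k + \tfrac12) t \, \sin \tfrac{t}{2} \\
&= \frac{1}{2 n \sin^2 (t/2)} \sum_{k=0}^{n-1} \big( \cos(kt) - \cos (k+1)t \big) \\
&= \frac{1 - \cos(nt)}{2 n \sin^2 (t/2)} \\
&= \frac{\sin^2 (nt/2)}{n \sin^2 (t/2)}.
\end{aligned}
\]
Having expressed the Fejér kernel in closed form, we now show that it forms a family of good kernels: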

Theorem.7. The Fejér kernel $F_n(t) = \frac{1}{n}\sum_{k=0}^{n-1} D_k(t)$ is a family of good kernels. Hence, if $f \in L^1(\mathbb{T}) \cap L^\infty(\mathbb{T})$, and $f$ is continuous at $x \in \mathbb{T}$, then $\sigma_n f(x) \to f(x)$.

Proof. It suffices to check that the three properties of definition.5 hold:

(i) We observe that

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} F_n(x)\,dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1}{n}\sum_{k=0}^{n-1} D_k(x)\,dx = \frac{1}{2\pi n}\sum_{k=0}^{n-1}\int_{-\pi}^{\pi} D_k(x)\,dx = 1,$$

since each integral $\int_{-\pi}^{\pi} D_k(x)\,dx$ equals $2\pi$.

(ii) As $F_n(x) \ge 0$ for all $x \in \mathbb{T}$ and all $n \in \mathbb{N}$, we see that $|F_n(x)| = F_n(x)$, so that $\int_{-\pi}^{\pi} |F_n(x)|\,dx = 2\pi$ for all $n \in \mathbb{N}$.

(iii) Fix $\delta > 0$. Whenever $\delta \le |x| \le \pi$, we have that $1/\sin^2(x/2) \le M_\delta$ for some constant $M_\delta$ which depends on $\delta$. Thus

$$F_n(x) = \frac{\sin^2(nx/2)}{n\sin^2(x/2)} \le \frac{M_\delta}{n}$$

for all $x$ such that $\delta \le |x| \le \pi$, and hence

$$\int_{\delta<|x|<\pi} F_n(x)\,dx \le \frac{2\pi M_\delta}{n} \to 0$$

as $n \to \infty$, as required.

The theorem implies that, if $x$ is a point of continuity of $f$, then there exists a sequence $\sigma_n f(x)$ of trigonometric polynomials which converges to $f(x)$. (Hence trigonometric polynomials are dense in $C^0(\mathbb{T})$, and hence they are dense in $L^p(\mathbb{T})$ for $1 \le p < \infty$.)

The difference between $S_n$ and $\sigma_n$ can be summarised as follows:

Going from $S_n f$ to $S_{n+1} f$, you do not change the Fourier coefficients already present: the coefficients of $S_{n+1} f$ with index $|k| \le n$ are exactly the same as those of $S_n f$.

Going from $\sigma_n f$ to $\sigma_{n+1} f$, you must recompute every Fourier coefficient!

Corollary.8. Let $f \in L^1(\mathbb{T})$. Suppose that $\hat f(n) = 0$ for all $n \in \mathbb{Z}$. Then $f(x) = 0$ for all points of continuity of $f$; in particular, $f(x) = 0$ for almost every $x \in \mathbb{T}$.

As an application of the Fejér kernel, let us exhibit an example of a function which is continuous but nowhere differentiable; that is, a continuous function $f : \mathbb{T} \to \mathbb{R}$ such that $f'(x)$ does not exist for any $x \in \mathbb{T}$. To do so, we prove a theorem showing that, if a function with Fourier coefficients of a particular form is differentiable at some point, then those Fourier coefficients must satisfy an estimate. We then exhibit a function which does not satisfy any such estimate, and which thus cannot be differentiable at any point.
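The closed form of the Fejér kernel and the three good-kernel properties checked in the proof above can be sanity-checked numerically. The sketch below is an illustration only, not part of the original notes: the helper names (`dirichlet`, `fejer_average`, `fejer_closed`, `integrate`, `tail`) and the midpoint-rule quadrature are assumptions made for this example.

```python
import math


def dirichlet(k, t):
    """D_k(t) = sum_{j=-k}^{k} e^{ijt}, written as a real cosine sum."""
    return 1.0 + 2.0 * sum(math.cos(j * t) for j in range(1, k + 1))


def fejer_average(n, t):
    """F_n(t) as the average of D_0, ..., D_{n-1}."""
    return sum(dirichlet(k, t) for k in range(n)) / n


def fejer_closed(n, t):
    """F_n(t) = sin^2(nt/2) / (n sin^2(t/2)), valid for t not a multiple of 2*pi."""
    return math.sin(n * t / 2) ** 2 / (n * math.sin(t / 2) ** 2)


def integrate(g, a, b, steps=20000):
    """Midpoint rule on [a, b]; the nodes never hit t = 0 for the ranges used here."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


def tail(n, delta):
    """Integral of F_n over delta < |t| < pi (F_n is even, so double one side)."""
    return 2.0 * integrate(lambda t: fejer_closed(n, t), delta, math.pi)


# Property (i): (1 / 2pi) * integral of F_n over [-pi, pi] should be 1.
mass = integrate(lambda t: fejer_closed(16, t), -math.pi, math.pi) / (2 * math.pi)

# Property (iii): the mass away from the origin shrinks as n grows.
delta = 0.5
m_delta = 1.0 / math.sin(delta / 2) ** 2
print(mass, tail(20, delta), tail(200, delta), 2 * math.pi * m_delta / 200)
```

Property (ii) needs no separate check, since the closed form shows $F_n \ge 0$.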

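Theorem.9. Let $g \in C^0(\mathbb{T})$ be periodic and continuous such that

$$\hat g(n) = \begin{cases} a_{\pm m} & \text{if } n = \pm 2^m \text{ for some } m \in \mathbb{N}, \\ 0 & \text{otherwise.} \end{cases}$$

If $g$ is differentiable at $x_0$, then there exists a constant $C$ such that

$$|a_{\pm m}| \le C m 2^{-m}$$

whenever $m \ge 1$.

Proof. Without loss of generality we may assume that $x_0 = 0$, since otherwise we may put $h(x) = g(x + x_0)$, which is differentiable at $0$ and has $|\hat h(n)| = |\hat g(n)|$. Furthermore, without loss of generality we may assume that $g(0) = 0$, since otherwise we may consider $g(x) - g(0)$, which will have the same Fourier coefficients except for $n = 0$.

As $g$ is differentiable at $x = 0$, $g$ is locally Lipschitz around $0$: that is, there exist $K_1 > 0$ and $\delta > 0$ such that whenever $|x| < \delta$, we have

$$|g(x)| \le K_1 |x|.$$

As $g$ is continuous, so is $g(x)/x$ for $\delta \le |x| \le \pi$; set

$$K_2 := \sup\left\{\left|\frac{g(x)}{x}\right| : \delta \le |x| \le \pi\right\}$$

(which is finite as $[-\pi,-\delta] \cup [\delta,\pi]$ is compact). Then, for $K := \max\{K_1, K_2\}$, we have that $|g(x)| \le K|x|$ for all $x \in [-\pi,\pi]$. Notice that, for $x \in [-\pi,\pi]$, we have $|\sin(x/2)| \ge |x|/\pi$, so

$$\frac{|g(x)|}{\sin^2(x/2)} \le \frac{K\pi^2}{|x|}.$$

Recall that $e_m(x) := e^{imx}$. We claim that

$$a_{+m} = \frac{1}{2\pi}\langle g, e_{2^m} F_M\rangle \quad\text{for } M = 2^{m-1}.$$

To see this, consider that

$$F_M(x) = \frac{1}{M}\sum_{k=0}^{M-1}\sum_{j=-k}^{k} e^{ijx} = \frac{1}{M}\sum_{k=-(M-1)}^{M-1} (M - |k|)\,e^{ikx}.$$

So $F_M$ is a sum of exponentials with frequencies between $-(2^{m-1}-1)$ and $2^{m-1}-1$; note that the coefficient of $e^{ikx}$ when $k = 0$ is $(M - |0|)/M = 1$. Hence $e_{2^m} F_M$ is a sum of exponentials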

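with frequencies between $2^m - (2^{m-1}-1) > 2^{m-1}$ and $2^m + (2^{m-1}-1) < 2^{m+1}$, and the coefficient in front of $e^{i2^m x}$ is $1$. As the Fourier coefficients $\hat g(n)$ of $g$ are $0$ unless $n$ is plus or minus a power of $2$, we see that

$$\langle g, e_{2^m} F_M\rangle = \langle g, e_{2^m}\rangle = 2\pi\,\hat g(2^m) = 2\pi\,a_{+m},$$

as required. We now use this to estimate the value of $a_{+m}$:

$$|a_{+m}| = \frac{1}{2\pi}\bigl|\langle g, e_{2^m} F_M\rangle\bigr| = \frac{1}{2\pi}\left|\int_{-\pi}^{\pi} g(x)\,\overline{e_{2^m}(x)}\,F_M(x)\,dx\right| = \frac{1}{2\pi}\left|\int_{-\pi}^{\pi} g(x)\,e^{-i2^m x}\,\frac{\sin^2(Mx/2)}{M\sin^2(x/2)}\,dx\right|$$

$$\le \frac{1}{2\pi M}\int_{-\pi}^{\pi} |g(x)|\,\frac{\sin^2(Mx/2)}{\sin^2(x/2)}\,dx \le \frac{K\pi^2}{2\pi M}\int_{-\pi}^{\pi}\frac{\sin^2(Mx/2)}{|x|}\,dx = \frac{K\pi}{M}\int_{0}^{\pi}\frac{\sin^2(Mx/2)}{x}\,dx$$

$$= \frac{K\pi}{M}\int_{0}^{1/M}\frac{\sin^2(Mx/2)}{x}\,dx + \frac{K\pi}{M}\int_{1/M}^{\pi}\frac{\sin^2(Mx/2)}{x}\,dx \le \frac{K\pi}{4M}\int_{0}^{1/M}\frac{M^2x^2}{x}\,dx + \frac{K\pi}{M}\int_{1/M}^{\pi}\frac{1}{x}\,dx$$

$$= \frac{K\pi}{8M} + \frac{K\pi}{M}(\log M + \log\pi) = \frac{K\pi}{M}\left(\frac18 + \log\pi + \log M\right) \le C\,2^{-m}\bigl(1 + \log(2^m)\bigr) \le C m 2^{-m}$$

for some constant $C$ (which may change from one inequality to the next). The same argument with $e_{-2^m}$ in place of $e_{2^m}$ gives the estimate for $a_{-m}$.

We now define

$$f(x) = \sum_{n=1}^{\infty} a^n\cos(2^n x)$$

for $0 < a < 1$. Noting that

$$\cos(2^n x) = \frac{e^{i2^n x} + e^{-i2^n x}}{2},$$

we may write

$$f(x) = \sum_{n \in \mathbb{Z}\setminus\{0\}} \frac{a^{|n|}}{2}\,e^{i\,\mathrm{sgn}(n)\,2^{|n|}x}.$$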

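Thus

$$\hat f(n) = \begin{cases} a^m/2 & \text{if } n = \pm 2^m \text{ for some } m \in \mathbb{N}, \\ 0 & \text{otherwise.} \end{cases}$$

If $f$ is differentiable at some point, then the coefficients $a_{\pm m} = a^m/2$ must satisfy an estimate of the form $|a_{\pm m}| \le C m 2^{-m}$, or, equivalently, $2^m m^{-1} |a_{\pm m}| \le C$. We show that no such estimate can hold when $a > 1/2$: observe that

$$2^m m^{-1}\,\frac{a^m}{2} = \frac{(2a)^m}{2m} \to +\infty$$

as $m \to \infty$, since $2a > 1$. Since no such estimate can hold, for $1/2 < a < 1$ the function $f$ cannot be differentiable at any point.

.5 Abel Summation and the Poisson Kernel

Given a series $\sum_{k=0}^{\infty} a_k$, which may or may not converge, but has $|a_k| \le M$ for all $k \in \mathbb{N}$, define

$$A(r) = \sum_{k=0}^{\infty} a_k r^k.$$

As $|a_k| \le M$, $A(r)$ is well defined for $|r| < 1$. If $A(1^-) := \lim_{r \to 1^-} A(r)$ exists, we denote it by $A(1)$ and say that

$$\sum_{k=0}^{\infty} a_k = A(1)$$

in the Abel sense. (Of course, if $\sum_{k=0}^{\infty} a_k$ converges, then $A(1)$ always exists and equals $\sum_{k=0}^{\infty} a_k$ in the usual sense.)

For example, let us consider $\sum_{k=0}^{\infty} (-1)^k (k+1) = 1 - 2 + 3 - 4 + \dots$. Technically this does not fit into the above definition (the terms are not bounded), but the power series $\sum_{k=0}^{\infty} (-1)^k (k+1) r^k$ converges for $|r| < 1$; notice that

$$A(r) = \sum_{k=0}^{\infty} (-1)^k (k+1) r^k = \frac{1}{(1+r)^2}.$$

So $1 - 2 + 3 - 4 + \dots = \tfrac14$ in the Abel sense.

Once again, we apply this to Fourier series. Given $f \in L^1(\mathbb{T})$, for $0 \le r < 1$, we define

$$A_r f(\theta) = \sum_{n=-\infty}^{\infty} \hat f(n)\,r^{|n|} e^{in\theta}.$$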
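The Abel sum of $1 - 2 + 3 - 4 + \dots$ can be approximated by evaluating truncated power series for $r$ close to $1$. This is a minimal sketch rather than anything from the original notes; `abel_sum` and `alternating` are hypothetical names, and the truncation length is an assumption chosen so that the neglected tail is negligible for the values of $r$ used.

```python
def abel_sum(coeff, r, terms=100000):
    """Truncated power series A(r) = sum_{k < terms} coeff(k) * r**k, for 0 <= r < 1."""
    total = 0.0
    power = 1.0  # r**k, updated incrementally
    for k in range(terms):
        total += coeff(k) * power
        power *= r
    return total


def alternating(k):
    """a_k = (-1)^k (k + 1), i.e. the series 1 - 2 + 3 - 4 + ..."""
    return (-1) ** k * (k + 1)


# A(r) should match the closed form 1 / (1 + r)^2 and approach 1/4 as r -> 1.
values = {r: abel_sum(alternating, r) for r in (0.9, 0.99, 0.999)}
for r, value in values.items():
    print(r, value, 1.0 / (1.0 + r) ** 2)
```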

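Once again we want to show that $A_r f(\theta) = \frac{1}{2\pi}(f * P_r)(\theta)$ for some kernel $P_r$:

$$A_r f(\theta) = \sum_{n=-\infty}^{\infty} \hat f(n)\,r^{|n|} e^{in\theta} = \sum_{n=-\infty}^{\infty} \left(\frac{1}{2\pi}\int_{-\pi}^{\pi} f(x) e^{-inx}\,dx\right) r^{|n|} e^{in\theta}$$

$$= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\sum_{n=-\infty}^{\infty} r^{|n|} e^{in(\theta - x)}\,dx \quad\text{(by the Dominated Convergence Theorem)}$$

$$= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x) P_r(\theta - x)\,dx,$$

where

$$P_r(\theta) := \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}$$

is called the Poisson kernel. Again, we compute $P_r$ in closed form. Writing $\omega := re^{i\theta}$,

$$P_r(\theta) = \sum_{n=0}^{\infty} r^n e^{in\theta} + \sum_{n=1}^{\infty} r^n e^{-in\theta} = \sum_{n=0}^{\infty} \omega^n + \sum_{n=1}^{\infty} \bar\omega^n = \frac{1}{1-\omega} + \frac{\bar\omega}{1-\bar\omega}$$

$$= \frac{(1-\bar\omega) + \bar\omega(1-\omega)}{(1-\omega)(1-\bar\omega)} = \frac{1 - |\omega|^2}{|1-\omega|^2} = \frac{1 - r^2}{|1 - re^{i\theta}|^2} = \frac{1 - r^2}{1 - 2r\cos\theta + r^2},$$

since $\omega\bar\omega = |\omega|^2 = r^2$.

Theorem.. The Poisson kernel $P_r(\theta) = \sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}$ is a family of good kernels. Hence, if $f \in L^1(\mathbb{T}) \cap L^\infty(\mathbb{T})$, and $f$ is continuous at $x \in \mathbb{T}$, then $A_r f(x) \to f(x)$ as $r \to 1^-$.

(Technically, theorem.6 applies as $n \to \infty$; however, it is easy to see that the same theorem will hold for convergence as $r \to 1^-$.)

Proof. Again, it suffices to check that the three properties of definition.5 hold: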

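(i) We observe that

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} P_r(\theta)\,d\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi}\sum_{n=-\infty}^{\infty} r^{|n|} e^{in\theta}\,d\theta = \frac{1}{2\pi}\sum_{n=-\infty}^{\infty} r^{|n|}\int_{-\pi}^{\pi} e^{in\theta}\,d\theta = 1,$$

since $\int_{-\pi}^{\pi} e^{in\theta}\,d\theta = 0$ unless $n = 0$, when it equals $2\pi$.

(ii) As $P_r(\theta) \ge 0$ for all $\theta \in \mathbb{T}$ and all $r \in [0,1)$, we see that $|P_r(\theta)| = P_r(\theta)$, so that $\int_{-\pi}^{\pi} |P_r(\theta)|\,d\theta = 2\pi$ for all $r \in [0,1)$.

(iii) Fix $\delta > 0$. We rewrite the denominator of $P_r(\theta)$ as follows:

$$1 - 2r\cos\theta + r^2 = (1-r)^2 + 2r(1 - \cos\theta).$$

Now, whenever $0 \le r < 1$, there exists $C_\delta > 0$ such that whenever $\delta < |\theta| < \pi$, we have $1 - 2r\cos\theta + r^2 \ge C_\delta > 0$. Thus

$$\int_{\delta<|\theta|<\pi} |P_r(\theta)|\,d\theta \le \frac{1}{C_\delta}\int_{\delta<|\theta|<\pi} (1 - r^2)\,d\theta \le \frac{2\pi(1 - r^2)}{C_\delta} \to 0$$

as $r \to 1^-$, as required.

As an application, we consider the Laplace equation on the unit ball $B \subset \mathbb{R}^2$:

$$\begin{cases} \Delta u = 0 & \text{in } B, \\ u = f & \text{on } \partial B. \end{cases}$$

In polar coordinates $(r,\theta)$, the Laplacian becomes

$$\Delta u = \frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r} + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2}.$$

Attempting a solution by separation of variables, let $u(r,\theta) = F(r)G(\theta)$; the Laplace equation then becomes

$$F''G + \frac{1}{r}F'G + \frac{1}{r^2}FG'' = 0.$$

Rearranging, we obtain that

$$\frac{r^2F'' + rF'}{F} = -\frac{G''}{G} = \lambda.$$

Since the left-hand side does not depend on $\theta$, and the right-hand side does not depend on $r$, both sides must in fact depend on neither and be constant, and thus both sides equal some constant $\lambda \in \mathbb{R}$. So we obtain the coupled equations

$$\begin{cases} G'' + \lambda G = 0, \\ r^2F'' + rF' - \lambda F = 0. \end{cases}$$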

We require that $G$ is $2\pi$-periodic, so the equation for $G$ will have solutions if and only if $\lambda = m^2$ for some $m \in \mathbb{Z}$. The solutions are

$$G(\theta) = Ae^{im\theta} + Be^{-im\theta}.$$

The solutions to the second equation depend on whether $m = 0$ or not. If $m = 0$, then we get the two linearly independent solutions $F(r) = 1$ and $F(r) = \log r$. On the other hand, if $m \ne 0$, we get the two solutions $F(r) = r^{|m|}$ and $F(r) = r^{-|m|}$. We only really want solutions such that $u(r,\theta)$ is bounded on $B$, so we consider only the solutions $F(r) = r^{|m|}$ for $m \in \mathbb{Z}$. Summing over all possible solutions for $m \in \mathbb{Z}$, we thus arrive at our postulated solution, given by Poisson's formula:

$$u(r,\theta) = \sum_{n=-\infty}^{\infty} \alpha_n r^{|n|} e^{in\theta},$$

where the $\alpha_n$ are constants to be determined by the boundary conditions. If $u(r,\theta) = f(\theta)$ at the boundary $\partial B$, then we would like to have that

$$\lim_{r \to 1} u(r,\theta) = f(\theta),$$

that is,

$$\lim_{r \to 1}\sum_{n=-\infty}^{\infty} \alpha_n r^{|n|} e^{in\theta} = f(\theta).$$

Interchanging the limit and the summation, we see that there can only be one choice of coefficients, namely $\alpha_n = \hat f(n)$. In that case, we have that

$$u(r,\theta) = \frac{1}{2\pi}(f * P_r)(\theta) = \sum_{n=-\infty}^{\infty} \hat f(n)\,r^{|n|} e^{in\theta}$$

is a solution that satisfies the required boundary conditions; what's more, as $P_r$ is a family of good kernels, we see that $\frac{1}{2\pi}(f * P_r)(\theta) \to f(\theta)$ as $r \to 1$ for every point of continuity of $f$. So we have proved the following theorem:

Theorem.. Let $f \in L^1(\mathbb{T}) \cap L^\infty(\mathbb{T})$. The unique (rotationally invariant) solution of

$$\begin{cases} \Delta u = 0 & \text{in } B, \\ u = f & \text{on } \partial B \end{cases}$$

is given by

$$u(r,\theta) = \frac{1}{2\pi}(f * P_r)(\theta) = \sum_{n=-\infty}^{\infty} \hat f(n)\,r^{|n|} e^{in\theta} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta - y)\,\frac{1 - r^2}{1 - 2r\cos y + r^2}\,dy$$

and satisfies $\lim_{r \to 1} u(r,\theta) = f(\theta)$ for every point of continuity of $f$.
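Poisson's formula can be tried out numerically on boundary data whose harmonic extension is known. The sketch below is illustrative only and makes several assumptions not found in the original notes: the helper names (`poisson_series`, `poisson_closed`, `poisson_integral`), the midpoint-rule quadrature, and the test function $f(\theta) = \cos\theta$, whose harmonic extension to the disc is $u(r,\theta) = r\cos\theta$.

```python
import math


def poisson_series(r, theta, terms=200):
    """Truncated series P_r(theta) = sum_n r^{|n|} e^{in theta}, as a cosine sum."""
    return 1.0 + 2.0 * sum(r ** n * math.cos(n * theta) for n in range(1, terms))


def poisson_closed(r, theta):
    """Closed form (1 - r^2) / (1 - 2 r cos(theta) + r^2)."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(theta) + r * r)


def poisson_integral(f, r, theta, steps=4000):
    """u(r, theta) = (1 / 2pi) * int_{-pi}^{pi} f(theta - y) P_r(y) dy, midpoint rule."""
    h = 2.0 * math.pi / steps
    total = 0.0
    for i in range(steps):
        y = -math.pi + (i + 0.5) * h
        total += f(theta - y) * poisson_closed(r, y)
    return total * h / (2.0 * math.pi)


# Boundary data f(theta) = cos(theta): only the coefficients n = +1 and n = -1
# are non-zero, so the series solution is u(r, theta) = r cos(theta).
r, theta = 0.5, 0.7
u_numeric = poisson_integral(math.cos, r, theta)
u_exact = r * math.cos(theta)
print(u_numeric, u_exact)
```

Taking $f \equiv 1$ in `poisson_integral` recovers property (i) of the good-kernel definition, since the result should then be $1$ for every $r$.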


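Fourier Transform

Recall that $\left\{\frac{1}{\sqrt{2\pi}} e^{inx}\right\}_{n \in \mathbb{Z}}$ is an orthonormal basis of $L^2([-\pi,\pi])$. We generalise the definition of Fourier series to an interval $[-L/2, L/2]$ of length $L$ by defining $e_{n,L}(x) := \frac{1}{\sqrt L} e^{2\pi inx/L}$, and setting

$$\hat f_L(n) = \frac{1}{\sqrt L}\int_{-L/2}^{L/2} f_L(x) e^{-2\pi inx/L}\,dx = \langle f_L, e_{n,L}\rangle$$

for some $f_L \in L^2([-L/2, L/2])$. While Fourier series are an excellent tool for functions on a compact interval (which we can think of as being periodic on all of $\mathbb{R}$), if we have a non-periodic function on all of $\mathbb{R}$ we seemingly cannot use Fourier series. In general, let us write

$$g_L(\xi) = \sqrt L\,\hat f_L(n) \quad\text{for } \xi \in [2\pi n/L,\ 2\pi(n+1)/L);$$

observe that

$$\int_{\mathbb{R}} |g_L(\xi)|^2\,d\xi = 2\pi\sum_{n \in \mathbb{Z}} |\hat f_L(n)|^2 = 2\pi\int_{-L/2}^{L/2} |f_L(x)|^2\,dx.$$

So in the limit as $L \to \infty$ (the period "becomes infinite"), we can think of $g(\xi)$ as some kind of Fourier transform, since formally:

$$g(\xi) = \lim_{L \to \infty} g_L(\xi) = \int_{-\infty}^{\infty} f(x) e^{-ix\xi}\,dx = \hat f(\xi).$$

. Definition and Basic Properties

Definition.. Let $f \in L^1(\mathbb{R}^n)$. We define $\hat f : \mathbb{R}^n \to \mathbb{C}$, the Fourier transform of $f$, by

$$\hat f(\xi) = \int_{\mathbb{R}^n} f(x) e^{-2\pi ix\cdot\xi}\,dx.$$

Proposition. (Properties of the Fourier transform). The following properties of the Fourier transform hold:

(i) (Linearity) Let $f, g \in L^1(\mathbb{R}^n)$ and $\alpha, \beta \in \mathbb{C}$. Then $(\alpha f + \beta g)^\wedge(\xi) = \alpha\hat f(\xi) + \beta\hat g(\xi)$.

(ii) (Continuity) Let $f \in L^1(\mathbb{R}^n)$. Then $\hat f$ is continuous, and satisfies $\|\hat f\|_{L^\infty} \le \|f\|_{L^1}$.

(iii) (Riemann-Lebesgue) Let $f \in L^1(\mathbb{R}^n)$. Then $\lim_{|\xi| \to \infty} \hat f(\xi) = 0$.

(iv) (Convolution) Let $f, g \in L^1(\mathbb{R}^n)$. Then $(f * g)^\wedge(\xi) = \hat f(\xi)\hat g(\xi)$.

(v) (Shift) Let $f \in L^1(\mathbb{R}^n)$ and let $h \in \mathbb{R}^n$. For $\tau_h f(x) = f(x+h)$, we have $(\tau_h f)^\wedge(\xi) = \hat f(\xi) e^{2\pi ih\cdot\xi}$, and for $\sigma_h f(x) = f(x) e^{2\pi ix\cdot h}$, we have $(\sigma_h f)^\wedge(\xi) = \hat f(\xi - h)$.

(vi) (Rotation) Let $f \in L^1(\mathbb{R}^n)$, and let $\Theta \in SO(n)$ be a rotation matrix. Then $(f(\Theta\,\cdot))^\wedge(\xi) = \hat f(\Theta\xi)$.

(vii) (Scaling) Let $f \in L^1(\mathbb{R}^n)$, let $\lambda > 0$, and define $g(x) = \lambda^{-n} f(x/\lambda)$. Then $g \in L^1(\mathbb{R}^n)$, and $\hat g(\xi) = \hat f(\lambda\xi)$.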

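(viii) (Differentiation) Let $f \in L^1(\mathbb{R}^n)$ such that $\frac{\partial f}{\partial x_j} \in L^1(\mathbb{R}^n)$. Then $\left(\frac{\partial f}{\partial x_j}\right)^\wedge(\xi) = (2\pi i\xi_j)\hat f(\xi)$.

(ix) (Multiplication) Let $f \in L^1(\mathbb{R}^n)$ such that $g_j(x) := -2\pi ix_j f(x)$ is in $L^1(\mathbb{R}^n)$. If $\hat f$ is differentiable in the $\xi_j$ direction, then $\hat g_j(\xi) = \frac{\partial}{\partial\xi_j}\hat f(\xi)$.

Proof. We will prove each part individually.

(i) Linearity of the Fourier transform follows from the linearity of the integral.

(ii) For $f \in L^1(\mathbb{R}^n)$, we have that

$$|\hat f(\xi)| = \left|\int_{\mathbb{R}^n} f(x) e^{-2\pi ix\cdot\xi}\,dx\right| \le \int_{\mathbb{R}^n} |f(x)|\,dx,$$

and hence $\|\hat f\|_{L^\infty} \le \|f\|_{L^1}$. To see that $\hat f$ is continuous, for $\xi, h \in \mathbb{R}^n$ consider

$$\hat f(\xi+h) - \hat f(\xi) = \int_{\mathbb{R}^n} f(x)\left(e^{-2\pi ix\cdot(\xi+h)} - e^{-2\pi ix\cdot\xi}\right) dx.$$

Noticing that the integrand is dominated by $2|f|$, by the dominated convergence theorem the limit as $h \to 0$ exists and equals $0$, and hence $\hat f$ is continuous.

(iii) This is the analogue of the Riemann-Lebesgue lemma for Fourier series. Suppose first that $f \in C_c(\mathbb{R}^n)$, that is, $f$ is continuous with compact support. Since $e^{-\pi i} = -1$, we have

$$\hat f(\xi) = \int_{\mathbb{R}^n} f(x) e^{-2\pi ix\cdot\xi}\,dx \qquad (1)$$

$$= -\int_{\mathbb{R}^n} f(x) e^{-2\pi i\left(x + \frac{e_n}{2\xi_n}\right)\cdot\xi}\,dx \qquad\text{where } e_n = (0,\dots,0,1) \text{ is the } n\text{th basis vector} \qquad (2)$$

$$= -\int_{\mathbb{R}^n} f\!\left(z - \frac{e_n}{2\xi_n}\right) e^{-2\pi iz\cdot\xi}\,dz \qquad\text{where } z = x + \frac{e_n}{2\xi_n}. \qquad (3)$$

From (1) and (3) we obtain that

$$\hat f(\xi) = \frac12\left(\int_{\mathbb{R}^n} f(x) e^{-2\pi ix\cdot\xi}\,dx - \int_{\mathbb{R}^n} f\!\left(x - \frac{e_n}{2\xi_n}\right) e^{-2\pi ix\cdot\xi}\,dx\right) = \frac12\int_{\mathbb{R}^n}\left(f(x) - f\!\left(x - \frac{e_n}{2\xi_n}\right)\right) e^{-2\pi ix\cdot\xi}\,dx.$$

(Here we use the coordinate $\xi_n$ for definiteness; replacing $e_n$ by the basis vector of a coordinate of largest modulus, we may assume $|\xi_n| \ge |\xi|/\sqrt n$, so that $|\xi_n| \to \infty$ as $|\xi| \to \infty$.) As $|\xi| \to \infty$, we have that $f(x) - f\!\left(x - \frac{e_n}{2\xi_n}\right) \to 0$ for each $x$ (since $f$ is continuous); as $f$ is bounded with compact support, the integrand is dominated by an integrable function, and by the dominated convergence theorem we have that $\lim_{|\xi| \to \infty} \hat f(\xi) = 0$. This proves the result for all $f \in C_c(\mathbb{R}^n)$.

In general, let $f \in L^1(\mathbb{R}^n)$, and fix $\varepsilon > 0$. As $C_c(\mathbb{R}^n)$ is dense in $L^1(\mathbb{R}^n)$, pick $g \in C_c(\mathbb{R}^n)$ such that $\|f - g\|_{L^1} < \varepsilon/2$. Then by property (ii), we have that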

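$\|\hat f - \hat g\|_{L^\infty} \le \|f - g\|_{L^1} < \varepsilon/2$. Furthermore, by the first part of the proof applied to $g$, we know that there exists $C$ such that when $|\xi| > C$ we have $|\hat g(\xi)| < \varepsilon/2$. Then for $|\xi| > C$, we have

$$|\hat f(\xi)| \le |\hat f(\xi) - \hat g(\xi)| + |\hat g(\xi)| \le \|\hat f - \hat g\|_{L^\infty} + |\hat g(\xi)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

(iv) The result for convolutions is just an application of Fubini's theorem:

$$(f * g)^\wedge(\xi) = \int_{\mathbb{R}^n} (f * g)(x) e^{-2\pi ix\cdot\xi}\,dx = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(x-y) g(y)\,dy\; e^{-2\pi ix\cdot\xi}\,dx$$

$$= \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(x-y) g(y) e^{-2\pi iy\cdot\xi}\,e^{2\pi iy\cdot\xi} e^{-2\pi ix\cdot\xi}\,dy\,dx = \int_{\mathbb{R}^n} g(y) e^{-2\pi iy\cdot\xi}\,dy\int_{\mathbb{R}^n} f(x-y) e^{-2\pi i(x-y)\cdot\xi}\,dx.$$

The result follows after changing variables.

(v) For $\tau_h f(x) = f(x+h)$, we compute that

$$(\tau_h f)^\wedge(\xi) = \int_{\mathbb{R}^n} \tau_h f(x) e^{-2\pi ix\cdot\xi}\,dx = \int_{\mathbb{R}^n} f(x+h) e^{-2\pi ix\cdot\xi}\,dx = \int_{\mathbb{R}^n} f(z) e^{-2\pi i(z-h)\cdot\xi}\,dz = e^{2\pi ih\cdot\xi}\hat f(\xi).$$

For $\sigma_h f(x) = f(x) e^{2\pi ix\cdot h}$, we compute that

$$(\sigma_h f)^\wedge(\xi) = \int_{\mathbb{R}^n} \sigma_h f(x) e^{-2\pi ix\cdot\xi}\,dx = \int_{\mathbb{R}^n} f(x) e^{2\pi ix\cdot h} e^{-2\pi ix\cdot\xi}\,dx = \int_{\mathbb{R}^n} f(x) e^{-2\pi ix\cdot(\xi-h)}\,dx = \hat f(\xi - h).$$

(vi) Let $\Theta \in SO(n)$ be a rotation matrix. As Lebesgue measure is rotationally invariant, we see that

$$(f(\Theta\,\cdot))^\wedge(\xi) = \int_{\mathbb{R}^n} f(\Theta x) e^{-2\pi ix\cdot\xi}\,dx = \int_{\mathbb{R}^n} f(z) e^{-2\pi i(\Theta^{-1}z)\cdot\xi}\,dz.$$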

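Noting that $\Theta^{-1}z\cdot\xi = \Theta^{T}z\cdot\xi = z\cdot\Theta\xi$, we see that $(f(\Theta\,\cdot))^\wedge(\xi) = \hat f(\Theta\xi)$, as required.

As a corollary of (vi), note that the Fourier transform of a radial function is radial (recall that $f$ is radial if $f(\Theta x) = f(x)$ for all $\Theta \in SO(n)$), since $\hat f(\Theta\xi) = (f(\Theta\,\cdot))^\wedge(\xi) = \hat f(\xi)$.

(vii) For $g(x) = \lambda^{-n} f(x/\lambda)$, we have that

$$\|g\|_{L^1} = \int_{\mathbb{R}^n} |g(x)|\,dx = \frac{1}{\lambda^n}\int_{\mathbb{R}^n} |f(x/\lambda)|\,dx = \int_{\mathbb{R}^n} |f(z)|\,dz = \|f\|_{L^1},$$

and

$$\hat g(\xi) = \int_{\mathbb{R}^n} g(x) e^{-2\pi ix\cdot\xi}\,dx = \frac{1}{\lambda^n}\int_{\mathbb{R}^n} f(x/\lambda) e^{-2\pi ix\cdot\xi}\,dx = \frac{1}{\lambda^n}\int_{\mathbb{R}^n} f(x/\lambda) e^{-2\pi i(x/\lambda)\cdot\lambda\xi}\,dx$$

$$= \int_{\mathbb{R}^n} f(y) e^{-2\pi iy\cdot\lambda\xi}\,dy \quad (\text{putting } y = x/\lambda) \quad = \hat f(\lambda\xi).$$

(viii) Using integration by parts, we see that

$$\left(\frac{\partial f}{\partial x_j}\right)^\wedge(\xi) = \int_{\mathbb{R}^n} \frac{\partial f}{\partial x_j}(x) e^{-2\pi ix\cdot\xi}\,dx = 2\pi i\xi_j\int_{\mathbb{R}^n} f(x) e^{-2\pi ix\cdot\xi}\,dx = 2\pi i\xi_j\hat f(\xi).$$

(ix) For $g_j(x) := -2\pi ix_j f(x)$, we see that

$$\hat g_j(\xi) = \int_{\mathbb{R}^n} f(x)\left(-2\pi ix_j e^{-2\pi ix\cdot\xi}\right) dx = \int_{\mathbb{R}^n} f(x)\frac{\partial}{\partial\xi_j} e^{-2\pi ix\cdot\xi}\,dx = \frac{\partial}{\partial\xi_j}\hat f(\xi),$$

as required.

From part (ii) of proposition., we have that $\|\hat f\|_{L^\infty} \le \|f\|_{L^1}$. In a finite measure space $X$ (such as a compact interval $[-\pi,\pi]$, or $\mathbb{T}$), $L^\infty(X) \subset L^1(X)$. However, this is not true in general: $g \in L^\infty(\mathbb{R}^n)$ does not imply $g \in L^1(\mathbb{R}^n)$. We would like to be able to say

$$f(x) = \int_{\mathbb{R}^n} \hat f(\xi) e^{2\pi ix\cdot\xi}\,d\xi,$$

but this makes no sense if $f$ is only in $L^1$, since we only know that $\hat f$ is in $L^\infty$, not $L^1$. We are thus forced to develop a slightly different theory of the Fourier transform on $L^2$.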
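The shift and scaling rules (v) and (vii) can be checked numerically in one dimension. The following is a rough sketch under stated assumptions, not part of the original notes: `fourier_transform` is a hypothetical midpoint-rule approximation on a truncated interval, and the test function $f(x) = (1+x)e^{-x^2}$ is chosen only because it decays fast enough for the truncation to be harmless.

```python
import cmath
import math


def fourier_transform(f, xi, half_width=12.0, steps=6000):
    """Midpoint-rule approximation of int f(x) e^{-2 pi i x xi} dx on [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * x * xi)
    return total * h


def f(x):
    """A rapidly decaying, non-even test function (an arbitrary choice for this sketch)."""
    return (1.0 + x) * math.exp(-x * x)


xi, shift, lam = 0.4, 0.3, 2.0

# (v): the transform of f(x + h) is e^{2 pi i h xi} times the transform of f.
lhs_shift = fourier_transform(lambda x: f(x + shift), xi)
rhs_shift = fourier_transform(f, xi) * cmath.exp(2j * math.pi * shift * xi)

# (vii): the transform of lam^{-1} f(x / lam) at xi is the transform of f at lam * xi (n = 1).
lhs_scale = fourier_transform(lambda x: f(x / lam) / lam, xi)
rhs_scale = fourier_transform(f, lam * xi)

print(abs(lhs_shift - rhs_shift), abs(lhs_scale - rhs_scale))
```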

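. Schwartz Space and the Fourier Transform

We now consider the class $S$ of Schwartz functions, which are so nice that the Fourier transform of a Schwartz function is another Schwartz function. We have:

$$C_c^\infty(\mathbb{R}^n) \subset S(\mathbb{R}^n) \subset C^\infty(\mathbb{R}^n).$$

The functions in $C_c^\infty$, being of compact support, are integrable, but there aren't very many of them. (Recall that the support of a function $f : X \to \mathbb{R}$ is defined as $\operatorname{spt} f := \overline{\{x : f(x) \ne 0\}}$, where the line denotes closure, and $f \in C_c^\infty(X)$ if $f$ is smooth and $\operatorname{spt} f$ is compact.) However, move to the larger class of $C^\infty$ functions and you know nothing about integrability. We define a set between these two, which is rich enough to contain lots of useful functions, but small enough that we can control the integrability of these functions.

Let us fix some notation: let $x = (x_1,\dots,x_n)$, and set $|x| = (x_1^2 + \dots + x_n^2)^{1/2}$. We define a multi-index to be an element $\alpha = (\alpha_1,\dots,\alpha_n) \in (\mathbb{N} \cup \{0\})^n$, and write $|\alpha| = \alpha_1 + \dots + \alpha_n$ and $\alpha! = \alpha_1!\cdots\alpha_n!$. We define $x^\alpha := x_1^{\alpha_1}\cdots x_n^{\alpha_n}$, that is, the product of the coordinates of $x$, each raised to the corresponding power in $\alpha$. Observe that $|x^\alpha| \le c_{n,\alpha}|x|^{|\alpha|}$ for some constant $c_{n,\alpha}$, since for $|x| = 1$ the function $x \mapsto |x^\alpha|$ is continuous on $S^{n-1}$ and hence attains its maximum and minimum, and the result follows by homogeneity of the Euclidean norm. Similarly, for $k \in \mathbb{N}$, we have $|x|^k \le c_{n,k}\sum_{|\beta|=k} |x^\beta|$.

If $f : \mathbb{R}^n \to \mathbb{C}$ is sufficiently differentiable, we write

$$\partial^\alpha f = \frac{\partial^{|\alpha|} f}{\partial x^\alpha} := \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}}.$$

With this notation, the Leibniz rule for the derivative of a product is

$$\frac{\partial^{|\alpha|}(fg)}{\partial x^\alpha} = \sum_{\beta \le \alpha} \frac{\alpha!}{\beta!(\alpha-\beta)!}\,\frac{\partial^{|\beta|} f}{\partial x^\beta}\,\frac{\partial^{|\alpha-\beta|} g}{\partial x^{\alpha-\beta}},$$

where $\beta \le \alpha$ if and only if $\beta_j \le \alpha_j$ for each $j = 1,\dots,n$.

Definition.3. Let $f : \mathbb{R}^n \to \mathbb{C}$ be a function in $C^\infty(\mathbb{R}^n)$. For multi-indices $\alpha, \beta$, define

$$\rho_{\alpha,\beta}(f) := \sup_{x \in \mathbb{R}^n} |x^\alpha\partial^\beta f(x)|.$$

We define the Schwartz class $S$ of functions $\mathbb{R}^n \to \mathbb{C}$ as

$$S(\mathbb{R}^n) := \{f \in C^\infty(\mathbb{R}^n) : \rho_{\alpha,\beta}(f) < \infty \text{ for all } \alpha, \beta \in (\mathbb{N} \cup \{0\})^n\}.$$

The $\rho_{\alpha,\beta}$ are seminorms; that is, for all $\alpha, \beta \in (\mathbb{N} \cup \{0\})^n$, $f, g \in S$, $\lambda, \mu \in \mathbb{C}$, we have

(i) $\rho_{\alpha,\beta}(f) \ge 0$;

(ii) $\rho_{\alpha,\beta}(\lambda f) = |\lambda|\,\rho_{\alpha,\beta}(f)$, and

(iii) $\rho_{\alpha,\beta}(\lambda f + \mu g) \le |\lambda|\,\rho_{\alpha,\beta}(f) + |\mu|\,\rho_{\alpha,\beta}(g)$.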

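That is, they are norms except that $\rho_{\alpha,\beta}(f) = 0$ does not (necessarily) imply that $f = 0$.

It is clear from the definition that, given $f \in S(\mathbb{R}^n)$, we have that $p(\cdot)f(\cdot) \in S(\mathbb{R}^n)$ for any polynomial $p$ on $\mathbb{R}^n$, and $\partial^\alpha f \in S(\mathbb{R}^n)$ for any multi-index $\alpha$. Note further that $f \in S(\mathbb{R}^n)$ if, and only if, for every natural number $N$ and every multi-index $\alpha$, there exists a constant $c_{\alpha,N}$ such that

$$|\partial^\alpha f(x)| \le c_{\alpha,N}(1 + |x|)^{-N}. \qquad (4)$$

For example, consider $f \in C_c^\infty(\mathbb{R}^n)$. By definition, there exists a compact set $K$ such that $\{x : f(x) \ne 0\} \subset K$. Hence each function $x^\alpha\partial^\beta f$ is zero outside $K$, and since every continuous function on a compact set is bounded, $\rho_{\alpha,\beta}(f) < +\infty$, and hence every function in $C_c^\infty(\mathbb{R}^n)$ is in $S(\mathbb{R}^n)$. As $C_c^\infty(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$, we see that $S(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$ (for $1 \le p < \infty$).

However, these are not all the functions in $S(\mathbb{R}^n)$. For example, the function $x \mapsto e^{-|x|^2}$ is a Schwartz function, since it decays at infinity faster than the reciprocal of any polynomial. However, the function $x \mapsto \frac{1}{(1+|x|^2)^\alpha}$ (for a fixed real $\alpha > 0$) is not in Schwartz space, since multiplying by $|x|^{3\alpha}$ yields an unbounded function.

Definition.4. Let $f_k$ be a sequence in $S(\mathbb{R}^n)$, and let $f \in S(\mathbb{R}^n)$. We say that $f_k \to f$ in $S(\mathbb{R}^n)$ if, and only if, for every pair of multi-indices $\alpha$ and $\beta$ we have

$$\rho_{\alpha,\beta}(f_k - f) = \sup_{x \in \mathbb{R}^n} |x^\alpha\partial^\beta(f_k - f)(x)| \to 0$$

as $k \to \infty$.

This defines a topology on $S(\mathbb{R}^n)$, and addition, scalar multiplication, and differentiation are continuous operators under this topology. What's more, if we let $\{\rho_j\}_{j=1}^{\infty}$ be some enumeration of the seminorms $\rho_{\alpha,\beta}$, then

$$d(f,g) := \sum_{j=1}^{\infty} 2^{-j}\,\frac{\rho_j(f-g)}{1 + \rho_j(f-g)}$$

defines a complete metric on $S(\mathbb{R}^n)$, and $S(\mathbb{R}^n)$ is locally convex under this metric. Thus $S(\mathbb{R}^n)$ is an example of a Fréchet space: see the appendix to [Fri&Jos] for more details.

Theorem.5. If $f_k \to f$ in $S(\mathbb{R}^n)$, then for any multi-index $\beta$ we have that $\partial^\beta f_k \to \partial^\beta f$ in $L^p(\mathbb{R}^n)$ as $k \to \infty$.

Proof. Set $g_k = f_k - f$. Then

$$\|\partial^\beta g_k\|_{L^p}^p = \int_{\mathbb{R}^n} |\partial^\beta g_k(x)|^p\,dx = \int_{|x|<1} |\partial^\beta g_k(x)|^p\,dx + \int_{|x|\ge1} \frac{1}{|x|^{n+1}}\,|x|^{n+1}|\partial^\beta g_k(x)|^p\,dx$$

$$\le \|\partial^\beta g_k\|_{L^\infty}^p\int_{|x|<1} dx + \sup_{|x|\ge1}\left(|x|^{n+1}|\partial^\beta g_k(x)|^p\right)\int_{|x|\ge1}\frac{dx}{|x|^{n+1}}$$

$$\le c_{n,p}\left(\|\partial^\beta g_k\|_{L^\infty} + \sup_{|x|\ge1} |x|^{(n+1)/p}|\partial^\beta g_k(x)|\right)^p$$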

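$\to 0$ as $k \to \infty$. Thus $\partial^\beta f_k \to \partial^\beta f$ in $L^p(\mathbb{R}^n)$, as required.

We now prove that the Fourier transform maps Schwartz space to Schwartz space, and that the mapping is continuous and invertible.

Theorem.6. The Fourier transform $\wedge : S(\mathbb{R}^n) \to S(\mathbb{R}^n)$, given by

$$\hat f(\xi) = \int_{\mathbb{R}^n} f(x) e^{-2\pi ix\cdot\xi}\,dx,$$

is a continuous linear operator, such that for all $f, g \in S(\mathbb{R}^n)$ we have

$$\int_{\mathbb{R}^n} f(x)\hat g(x)\,dx = \int_{\mathbb{R}^n} \hat f(x) g(x)\,dx$$

and that for all $f \in S(\mathbb{R}^n)$ and all $x \in \mathbb{R}^n$ we have

$$f(x) = \int_{\mathbb{R}^n} \hat f(\xi) e^{2\pi ix\cdot\xi}\,d\xi.$$

In order to prove it, we prove the following lemma:

Lemma.7. If $f(x) = e^{-\pi|x|^2}$, then $\hat f(\xi) = e^{-\pi|\xi|^2}$; that is, $\hat f = f$.

Proof. Since $f$ is radial and factorises as a product of one-dimensional Gaussians, it suffices to prove this on the real line. As $f'(x) = -2\pi xe^{-\pi x^2}$, notice that $f : \mathbb{R} \to \mathbb{R}$, $f(x) = e^{-\pi x^2}$ is the unique solution of

$$\begin{cases} u' + 2\pi xu = 0, \\ u(0) = 1. \end{cases} \qquad (5)$$

We show that $\hat f$ also solves equation (5). First notice that

$$\hat f(\xi) = \int_{-\infty}^{\infty} e^{-\pi x^2} e^{-2\pi ix\xi}\,dx,$$

so

$$\hat f(0) = \int_{-\infty}^{\infty} e^{-\pi x^2}\,dx = 1.$$

We now compute $\hat f'$:

$$\hat f'(\xi) = \frac{d}{d\xi}\int_{-\infty}^{\infty} e^{-\pi x^2} e^{-2\pi ix\xi}\,dx = \int_{-\infty}^{\infty} e^{-\pi x^2}\frac{\partial}{\partial\xi} e^{-2\pi ix\xi}\,dx = \int_{-\infty}^{\infty} -2\pi ixe^{-\pi x^2} e^{-2\pi ix\xi}\,dx = i\,(f')^\wedge(\xi) = -2\pi\xi\hat f(\xi)$$

by part (viii) of proposition.. Thus $\hat f = f$.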
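The lemma can be checked numerically on the real line. Because $e^{-\pi x^2}$ is even, its Fourier transform reduces to a cosine integral. This is a small illustrative sketch, not part of the original notes; the function name `gaussian_hat`, the truncation to $[-8, 8]$, and the midpoint rule are all assumptions of the example.

```python
import math


def gaussian_hat(xi, half_width=8.0, steps=4000):
    """Approximate int e^{-pi x^2} e^{-2 pi i x xi} dx by the real cosine integral
    int e^{-pi x^2} cos(2 pi x xi) dx (the sine part vanishes by symmetry)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * h
        total += math.exp(-math.pi * x * x) * math.cos(2.0 * math.pi * x * xi)
    return total * h


# Compare the numerical transform with e^{-pi xi^2} at a few frequencies.
samples = [0.0, 0.5, 1.0, 2.0]
errors = [abs(gaussian_hat(xi) - math.exp(-math.pi * xi * xi)) for xi in samples]
print(errors)
```

The value at $\xi = 0$ is the normalisation $\int e^{-\pi x^2}\,dx = 1$ used as the initial condition in equation (5).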

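Proof of theorem.6. We divide the proof into three parts. First, we prove that if $f \in S(\mathbb{R}^n)$ then $\hat f \in S(\mathbb{R}^n)$; that is, for all multi-indices $\alpha$ and $\beta$ we have that

$$\sup_{\xi \in \mathbb{R}^n} |\xi^\alpha\partial^\beta\hat f(\xi)| < \infty.$$

Using parts (viii) and (ix) of proposition., we see that

$$\xi^\alpha\partial^\beta\hat f(\xi) = \xi^\alpha(-2\pi i)^{|\beta|}(x^\beta f)^\wedge(\xi) = (2\pi i)^{-|\alpha|}(-2\pi i)^{|\beta|}\left(\partial^\alpha(x^\beta f)\right)^\wedge(\xi).$$

Thus, with $c = (2\pi)^{|\beta|-|\alpha|}$, we see that

$$\|\xi^\alpha\partial^\beta\hat f\|_{L^\infty} = c\,\|(\partial^\alpha(x^\beta f))^\wedge\|_{L^\infty} \le c\,\|\partial^\alpha(x^\beta f)\|_{L^1} \le c\,c_{\alpha,\beta,N}\left\|\frac{(1+|x|)^{|\beta|}}{(1+|x|)^N}\right\|_{L^1}$$

by the Leibniz rule and (4), for all $N \in \mathbb{N}$, and the last norm is finite whenever $N > |\beta| + n$. Hence $\hat f \in S(\mathbb{R}^n)$.

For the second part, we prove that, for all $f, g \in S(\mathbb{R}^n)$, we have

$$\int_{\mathbb{R}^n} f(x)\hat g(x)\,dx = \int_{\mathbb{R}^n} \hat f(x) g(x)\,dx.$$

This is an application of Fubini's theorem:

$$\int_{\mathbb{R}^n} f(x)\hat g(x)\,dx = \int_{\mathbb{R}^n} f(x)\int_{\mathbb{R}^n} g(y) e^{-2\pi ix\cdot y}\,dy\,dx = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(x) g(y) e^{-2\pi ix\cdot y}\,dx\,dy$$

$$= \int_{\mathbb{R}^n} g(y)\int_{\mathbb{R}^n} f(x) e^{-2\pi ix\cdot y}\,dx\,dy = \int_{\mathbb{R}^n} g(y)\hat f(y)\,dy.$$

Finally, we prove that for all $f \in S(\mathbb{R}^n)$ we have

$$f(x) = \int_{\mathbb{R}^n} \hat f(\xi) e^{2\pi ix\cdot\xi}\,d\xi.$$

Fix $f \in S(\mathbb{R}^n)$. Given $g \in S(\mathbb{R}^n)$ and $\lambda > 0$, put $g_\lambda(x) = \frac{1}{\lambda^n} g(x/\lambda)$. By the previous part, we have that

$$\int_{\mathbb{R}^n} \hat f(x) g_\lambda(x)\,dx = \int_{\mathbb{R}^n} f(x)\hat g_\lambda(x)\,dx.$$

By part (vii) of proposition., we have that $\hat g_\lambda(x) = \hat g(\lambda x)$. So,

$$\int_{\mathbb{R}^n} \hat f(x)\frac{1}{\lambda^n} g\!\left(\frac{x}{\lambda}\right) dx = \int_{\mathbb{R}^n} f(x)\hat g(\lambda x)\,dx = \int_{\mathbb{R}^n} \frac{1}{\lambda^n} f\!\left(\frac{x}{\lambda}\right)\hat g(x)\,dx,$$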

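where the second equality arises from a change of variables $x \mapsto x/\lambda$. Cancelling the $1/\lambda^n$ from both sides, we obtain

$$\int_{\mathbb{R}^n} \hat f(x) g\!\left(\frac{x}{\lambda}\right) dx = \int_{\mathbb{R}^n} f\!\left(\frac{x}{\lambda}\right)\hat g(x)\,dx.$$

By the Dominated Convergence Theorem, as $\lambda \to \infty$, we see that

$$g(0)\int_{\mathbb{R}^n} \hat f(x)\,dx = f(0)\int_{\mathbb{R}^n} \hat g(x)\,dx.$$

In particular, for $g(x) = e^{-\pi|x|^2}$, by lemma.7 we have $g(0) = 1$ and $\int \hat g = \int g = 1$, so that

$$f(0) = \int_{\mathbb{R}^n} \hat f(x)\,dx;$$

that is, $f(x) = \int_{\mathbb{R}^n} \hat f(\xi) e^{2\pi ix\cdot\xi}\,d\xi$ for $x = 0$. Now, as in part (v) of proposition., define $\tau_x f(y) = f(y+x)$. Then

$$f(x) = \tau_x f(0) = \int_{\mathbb{R}^n} (\tau_x f)^\wedge(\xi)\,d\xi = \int_{\mathbb{R}^n} \hat f(\xi) e^{2\pi ix\cdot\xi}\,d\xi,$$

by part (v) of proposition.. This completes the proof of the theorem.

We can thus make the following definition:

Definition.8. Given $f \in S(\mathbb{R}^n)$, we define the inverse Fourier transform $\check f \in S(\mathbb{R}^n)$ by

$$\check f(x) = \int_{\mathbb{R}^n} f(\xi) e^{2\pi ix\cdot\xi}\,d\xi.$$

Observe that $\check f(x) = \hat f(-x)$. Theorem.6 thus has the following corollary:

Corollary.9. For all $f \in S(\mathbb{R}^n)$, we have that $(\hat f)^\vee = (\check f)^\wedge = f$.

As a consequence of the definition of the Fourier transform for Schwartz functions, we obtain the following two very important results due to Parseval and Plancherel:

Proposition. (Parseval). For all $f, g \in S(\mathbb{R}^n)$, we have that

$$\langle f, g\rangle_{L^2} = \int_{\mathbb{R}^n} f(x)\overline{g(x)}\,dx = \int_{\mathbb{R}^n} \hat f(\xi)\overline{\hat g(\xi)}\,d\xi = \langle\hat f, \hat g\rangle_{L^2}.$$

Proof. Fix $f, g \in S(\mathbb{R}^n)$. By the second part of theorem.6, we have that

$$\int_{\mathbb{R}^n} f(x)\hat h(x)\,dx = \int_{\mathbb{R}^n} \hat f(x) h(x)\,dx$$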

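for any $h \in S(\mathbb{R}^n)$. In particular, put $h(x) = \overline{\hat g(x)}$. Then we see that

$$\hat h(\xi) = \int_{\mathbb{R}^n} h(x) e^{-2\pi ix\cdot\xi}\,dx = \int_{\mathbb{R}^n} \overline{\hat g(x)}\,e^{-2\pi ix\cdot\xi}\,dx = \overline{\int_{\mathbb{R}^n} \hat g(x) e^{2\pi ix\cdot\xi}\,dx} = \overline{g(\xi)}$$

by the third part of theorem.6. Hence

$$\int_{\mathbb{R}^n} f(x)\overline{g(x)}\,dx = \int_{\mathbb{R}^n} \hat f(x)\overline{\hat g(x)}\,dx.$$

By taking $g = f$, we obtain the following corollary:

Corollary. (Plancherel). For all $f \in S(\mathbb{R}^n)$, we have that

$$\|f\|_{L^2}^2 = \int_{\mathbb{R}^n} f(x)\overline{f(x)}\,dx = \int_{\mathbb{R}^n} \hat f(\xi)\overline{\hat f(\xi)}\,d\xi = \|\hat f\|_{L^2}^2.$$

.3 Extending the Fourier Transform to $L^p(\mathbb{R}^n)$

Having defined the Fourier transform as a continuous map $\wedge : S(\mathbb{R}^n) \to S(\mathbb{R}^n)$ and shown that it is invertible there, we now seek to extend it to $L^p(\mathbb{R}^n)$ for suitable values of $p$. We begin by considering the extension to $L^2(\mathbb{R}^n)$.

Corollary. (Plancherel) tells us that $\wedge$ is a bounded linear operator on $S(\mathbb{R}^n) \subset L^1 \cap L^2(\mathbb{R}^n)$ with respect to the $L^2$ norm, since $\|f\|_{L^2} = \|\hat f\|_{L^2}$. As $S(\mathbb{R}^n)$ is dense in $L^1 \cap L^2(\mathbb{R}^n)$, we can extend the Fourier transform to a unique operator $\wedge : L^1 \cap L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$. Since we are in a subset of $L^1(\mathbb{R}^n)$, the properties of proposition. all hold. As $L^1 \cap L^2(\mathbb{R}^n)$ is dense in $L^2(\mathbb{R}^n)$, we may extend the Fourier transform $\wedge : L^1 \cap L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$ to a unique operator $\mathcal{F} : L^2(\mathbb{R}^n) \to L^2(\mathbb{R}^n)$.

The immediate natural question is, given $f \in L^2(\mathbb{R}^n)$, whether $\int_{\mathbb{R}^n} f(x) e^{-2\pi ix\cdot\xi}\,dx$ converges, and whether it equals $\mathcal{F}(f)$.

Lemma.. If $f_k \in L^1 \cap L^2(\mathbb{R}^n)$, and $f_k \to f$ in $L^2(\mathbb{R}^n)$, then $\hat f_k$ is a Cauchy sequence in $L^2(\mathbb{R}^n)$.

Proof. We have $\|\hat f_j - \hat f_k\|_{L^2} = \|(f_j - f_k)^\wedge\|_{L^2} = \|f_j - f_k\|_{L^2}$, so whenever $(f_k)$ is Cauchy, so is $(\hat f_k)$.

Given a sequence $f_k \in L^1 \cap L^2(\mathbb{R}^n)$ such that $f_k \to f$ for $f \in L^2(\mathbb{R}^n)$, we define $\mathcal{F}(f)$ as the $L^2$ limit of $\hat f_k$, that is, $\|\hat f_k - \mathcal{F}(f)\|_{L^2} \to 0$. (It is easy to check that this is independent of the sequence $(f_k)$ chosen.) For an example of such a sequence, given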

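$f \in L^2(\mathbb{R}^n)$, we may define $f_k = f\chi_{B_k}$, where $B_k = \{x : |x| \le k\}$ is the ball of radius $k$ about the origin. Note that, by the Cauchy-Schwarz inequality,

$$\int_{\mathbb{R}^n} |f_k(x)|\,dx = \int_{\mathbb{R}^n} |f(x)|\chi_{B_k}(x)\,dx \le \|f\|_{L^2}(\operatorname{vol} B_k)^{1/2},$$

so that $f_k \in L^1 \cap L^2(\mathbb{R}^n)$. Since $|f - f_k|^2 = |f|^2(1 - \chi_{B_k}) \le |f|^2$, and $|f|^2 \in L^1$, by the Dominated Convergence Theorem we have that $f_k \to f$ in $L^2(\mathbb{R}^n)$.

Now, from measure theory we know that if $f_k \to f$ in $L^p(\mu)$, then $f_k \to f$ in measure (with respect to $\mu$); and if $f_k \to f$ in measure (with respect to $\mu$), there exists a subsequence $f_{k_j} \to f$ which converges pointwise $\mu$-almost everywhere. Thus, there exists a sequence $k_j$ such that $\hat f_{k_j} \to \mathcal{F}(f)$ pointwise almost everywhere, i.e. such that

$$\lim_{j \to \infty}\int_{|x| \le k_j} f(x) e^{-2\pi ix\cdot\xi}\,dx = \mathcal{F}(f)(\xi)$$

for almost every $\xi$. For the real line, i.e. $n = 1$, it turns out that all subsequences converge pointwise almost everywhere, so in fact the pointwise (a.e.) limit

$$\lim_{k \to \infty} \hat f_k(\xi) = \lim_{k \to \infty}\int_{|x| \le k} f(x) e^{-2\pi ix\xi}\,dx$$

exists and equals $\mathcal{F}(f)$. It would be nice if the same were true in dimension 2 and higher; however, the answer is not known and this remains an open problem!

One can use the same procedure to define the inverse Fourier transform on $L^2(\mathbb{R}^n)$: denote by $\mathcal{F}^{-1}$ the extension of $\vee$ to $L^2$. Let $f_k$ be a sequence in $S(\mathbb{R}^n)$ such that $f_k \to f$ in $L^2(\mathbb{R}^n)$; then for each $k$ we know that $\check f_k(x) = \hat f_k(-x)$, so that $\mathcal{F}^{-1}(f)(x) = \mathcal{F}(f)(-x)$ for every $f \in L^2(\mathbb{R}^n)$, as before. By convention, for $f \in L^2$ we write $\hat f$ for $\mathcal{F}(f)$, and $\check f$ for $\mathcal{F}^{-1}(f)$.

Unfortunately, there is no way of extending the Fourier transform to $L^1$ in such a way that the operation is invertible. So the next question is whether or not we can extend the Fourier transform to $L^p(\mathbb{R}^n)$ for, say, $1 < p < 2$.

Definition.3. Let $1 < p < 2$, and let $f \in L^p(\mathbb{R}^n)$. For a decomposition $f = f_1 + f_2$ where $f_1 \in L^1(\mathbb{R}^n)$ and $f_2 \in L^2(\mathbb{R}^n)$, we define the Fourier transform of $f$ by $\hat f = \hat f_1 + \hat f_2$.

For an example of such a decomposition, split $f$ according to its size: take $f_1 = f\chi_{\{|f| > 1\}}$ and $f_2 = f\chi_{\{|f| \le 1\}}$. Then $|f_1| \le |f|^p$ gives $f_1 \in L^1$, and $|f_2|^2 \le |f|^p$ gives $f_2 \in L^2$. (Cutting off with the indicator of a ball does not work in general, since $f$ need not be square integrable away from the origin.)

Lemma.4. Let $1 < p < 2$, and let $f \in L^p(\mathbb{R}^n)$. The Fourier transform of $f$, as defined above, is independent of the decomposition chosen; that is, if $f = f_1 + f_2 = g_1 + g_2$ are two decompositions, with $f_1, g_1 \in L^1(\mathbb{R}^n)$ and $f_2, g_2 \in L^2(\mathbb{R}^n)$, then $\hat f_1 + \hat f_2 = \hat g_1 + \hat g_2$.

Proof. If $f = f_1 + f_2 = g_1 + g_2$, with $f_1, g_1 \in L^1(\mathbb{R}^n)$ and $f_2, g_2 \in L^2(\mathbb{R}^n)$, then

$$L^1(\mathbb{R}^n) \ni f_1 - g_1 = g_2 - f_2 \in L^2(\mathbb{R}^n),$$

so that $f_1 - g_1,\ g_2 - f_2 \in L^1 \cap L^2(\mathbb{R}^n)$. Thus we may take their Fourier transforms as functions in $L^1 \cap L^2$, and their Fourier transforms will agree, that is, $\hat f_1 - \hat g_1 = \hat g_2 - \hat f_2$, and hence $\hat f_1 + \hat f_2 = \hat g_1 + \hat g_2$, as required.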