JUHA KINNUNEN. Real Analysis


JUHA KINNUNEN. Real Analysis. Department of Mathematics and Systems Analysis, Aalto University. Updated 3 April 2016

Contents

1 $L^p$ spaces
1.1 $L^p$ functions
1.2 $L^p$ norm
1.3 $L^p$ spaces for $0 < p < 1$
1.4 Completeness of $L^p$
1.5 $L^\infty$ space

2 The Hardy-Littlewood maximal function
2.1 Local $L^p$ spaces
2.2 Definition of the maximal function
2.3 Hardy-Littlewood-Wiener maximal function theorems
2.4 Lebesgue's differentiation theorem
2.5 The fundamental theorem of calculus
2.6 Points of density
2.7 The Sobolev embedding

3 Convolutions
3.1 Two additional properties of the $L^p$ spaces
3.2 Definition of the convolution
3.3 Approximations of the identity
3.4 Pointwise convergence
3.5 Convergence in $L^p$
3.6 Smoothing properties
3.7 The Poisson kernel

4 Differentiation of measures
4.1 Covering theorems
4.2 The Lebesgue differentiation theorem for Radon measures
4.3 The Radon-Nikodym theorem
4.4 The Lebesgue decomposition
4.5 Lebesgue and density points revisited

5 Existence, convergence and compactness for Radon measures
5.1 The Riesz representation theorem for $L^p$ spaces
5.2 Partitions of unity
5.3 The Riesz representation theorem for Radon measures
5.4 Weak convergence and compactness of Radon measures
5.5 Weak convergence in $L^p$

The $L^p$ spaces are probably the most important function spaces in analysis. This section gives basic facts about $L^p$ spaces for general measures. These include Hölder's inequality, Minkowski's inequality, the Riesz-Fischer theorem, which shows the completeness, and the corresponding facts for the $L^\infty$ space.

1 $L^p$ spaces

In this section we study the $L^p$ spaces in order to be able to capture finer quantitative information on the size of measurable functions and the effect of operators on such functions. The cases $0 < p < 1$, $p = 1$, $p = 2$, $1 < p < \infty$ and $p = \infty$ are different in character, but they all play an important role in modern analysis, for example, in Fourier and harmonic analysis, functional analysis and partial differential equations. The space $L^1$ of integrable functions plays a central role in measure and integration theory. The Hilbert space $L^2$ of square integrable functions is important in the study of Fourier series. Many operators that arise in applications are bounded in $L^p$ for $1 < p < \infty$, but the limit cases $L^1$ and $L^\infty$ require special attention.

1.1 $L^p$ functions

Definition 1.1. Let $\mu$ be an outer measure on $\mathbb{R}^n$, $A \subset \mathbb{R}^n$ a $\mu$-measurable set and $f : \mathbb{R}^n \to [-\infty, \infty]$ a $\mu$-measurable function. Then $f \in L^p(A)$, $1 \le p < \infty$, if

\[ \|f\|_p = \Big( \int_A |f|^p \, d\mu \Big)^{1/p} < \infty. \]

THE MORAL: For $p = 1$, $f \in L^1(A)$ if and only if $|f|$ is integrable in $A$. For $1 \le p < \infty$, $f \in L^p(A)$ if and only if $|f|^p$ is integrable in $A$.

Remark 1.2. The measurability assumption on $f$ is essential in the definition. For example, let $A \subset [0,1]$ be a non-measurable set with respect to the one-dimensional Lebesgue measure and consider $f : [0,1] \to \mathbb{R}$,

\[ f(x) = \begin{cases} 1, & x \in A, \\ -1, & x \in [0,1] \setminus A. \end{cases} \]

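Then $f^2 = 1$ is integrable on $[0,1]$, but $f$ is not a Lebesgue measurable function.

Example 1.3. Let $f : \mathbb{R}^n \to [0, \infty]$, $f(x) = |x|^{-n}$, and assume that $\mu$ is the Lebesgue measure. Let $A = B(0,1) = \{x \in \mathbb{R}^n : |x| < 1\}$ and denote $A_i = B(0, 2^{-i+1}) \setminus B(0, 2^{-i})$, $i = 1, 2, \dots$ Since $|x| \ge 2^{-i}$ for $x \in A_i$, we have $|x|^{-np} \le 2^{npi}$ in $A_i$. With $\Omega_n = |B(0,1)|$, this gives

\[
\begin{aligned}
\int_{B(0,1)} |x|^{-np} \, dx
&= \sum_{i=1}^\infty \int_{A_i} |x|^{-np} \, dx
\le \sum_{i=1}^\infty \int_{A_i} 2^{npi} \, dx
= \sum_{i=1}^\infty 2^{npi} |A_i| \\
&\le \sum_{i=1}^\infty 2^{npi} |B(0, 2^{-i+1})|
= \Omega_n \sum_{i=1}^\infty 2^{npi} (2^{-i+1})^n \\
&= \Omega_n \sum_{i=1}^\infty 2^{npi - ni + n}
= 2^n \Omega_n \sum_{i=1}^\infty 2^{in(p-1)} < \infty,
\end{aligned}
\]

if $n(p-1) < 0 \iff p < 1$. Thus $f \in L^p(B(0,1))$, if $p < 1$.

On the other hand, $|x| < 2^{-i+1}$ for $x \in A_i$, so that $|x|^{-np} \ge 2^{np(i-1)}$ in $A_i$. Moreover,

\[ |A_i| = |B(0, 2^{-i+1})| - |B(0, 2^{-i})| = \Omega_n (2^{(-i+1)n} - 2^{-in}) = \Omega_n (2^n - 1) 2^{-in}. \]

Thus

\[
\begin{aligned}
\int_{B(0,1)} |x|^{-np} \, dx
&= \sum_{i=1}^\infty \int_{A_i} |x|^{-np} \, dx
\ge \sum_{i=1}^\infty \int_{A_i} 2^{np(i-1)} \, dx
= \sum_{i=1}^\infty 2^{np(i-1)} |A_i| \\
&= \Omega_n (2^n - 1) 2^{-np} \sum_{i=1}^\infty 2^{npi} 2^{-in}
= C(n,p) \sum_{i=1}^\infty 2^{in(p-1)} = \infty,
\end{aligned}
\]

if $n(p-1) \ge 0 \iff p \ge 1$. Thus $f \notin L^p(B(0,1))$, if $p \ge 1$. This shows that

\[ f \in L^p(B(0,1)) \iff p < 1. \]

If $A = \mathbb{R}^n \setminus B(0,1)$, then we denote $A_i = B(0, 2^i) \setminus B(0, 2^{i-1})$, $i = 1, 2, \dots$, and a similar argument as above shows that

\[ f \in L^p(\mathbb{R}^n \setminus B(0,1)) \iff p > 1. \]

Observe that $f \notin L^1(B(0,1))$ and $f \notin L^1(\mathbb{R}^n \setminus B(0,1))$. Thus $f(x) = |x|^{-n}$ is a borderline function in $\mathbb{R}^n$ as far as integrability is concerned.

THE MORAL: The smaller the parameter $p$ is, the worse local singularities an $L^p$ function may have. On the other hand, the larger the parameter $p$ is, the more an $L^p$ function may spread out globally.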
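The borderline behaviour in Example 1.3 can be made concrete with a small numerical sketch. It assumes the one-dimensional case $n = 1$ and uses the closed-form value of the integral of $|x|^{-p}$ over $\varepsilon < |x| < 1$; the function name `truncated_integral` is a hypothetical helper introduced here for illustration only.

```python
import math


def truncated_integral(p: float, eps: float) -> float:
    """Closed-form integral of |x|**(-p) over eps < |x| < 1 on the real line (n = 1)."""
    if p == 1.0:
        # The antiderivative of 1/x is log x; the factor 2 covers both signs of x.
        return 2.0 * math.log(1.0 / eps)
    # The antiderivative of x**(-p) is x**(1 - p) / (1 - p) when p != 1.
    return 2.0 * (1.0 - eps ** (1.0 - p)) / (1.0 - p)


if __name__ == "__main__":
    # Shrinking eps removes less and less of the singularity at the origin.
    for p in (0.5, 1.0, 1.5):
        values = [truncated_integral(p, 10.0 ** (-k)) for k in (2, 4, 8)]
        print(f"p = {p}: {values}")
```

For $p < 1$ the values stay below the finite limit $2/(1-p)$ as $\varepsilon \to 0$, while for $p \ge 1$ they grow without bound, in line with the dyadic estimates above.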

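Example 1.4. Suppose that $f : \mathbb{R}^n \to [0, \infty]$ is radial. Thus $f$ depends only on $|x|$ and it can be expressed as $f(|x|)$, where $f$ is a function defined on $[0, \infty)$. Then

\[ \int_{\mathbb{R}^n} f(|x|) \, dx = \omega_{n-1} \int_0^\infty f(r) r^{n-1} \, dr, \tag{1.1} \]

where

\[ \omega_{n-1} = \frac{2\pi^{n/2}}{\Gamma(n/2)} \]

is the $(n-1)$-dimensional volume of the unit sphere $\partial B(0,1) = \{x \in \mathbb{R}^n : |x| = 1\}$.

Let us show how to use this formula to compute the volume of a ball $B(x,r) = \{y \in \mathbb{R}^n : |y - x| < r\}$, $x \in \mathbb{R}^n$ and $r > 0$. Denote $\Omega_n = m(B(0,1))$. By the translation and scaling invariance, we have

\[
\begin{aligned}
r^n \Omega_n &= r^n m(B(0,1)) = m(B(x,r)) = m(B(0,r)) \\
&= \int_{\mathbb{R}^n} \chi_{B(0,r)}(y) \, dy = \int_{\mathbb{R}^n} \chi_{(0,r)}(|y|) \, dy \\
&= \omega_{n-1} \int_0^r \rho^{n-1} \, d\rho = \omega_{n-1} \frac{r^n}{n}.
\end{aligned}
\]

In particular, it follows that $\omega_{n-1} = n \Omega_n$ and

\[ m(B(x,r)) = \frac{2\pi^{n/2}}{\Gamma(n/2)} \frac{r^n}{n} = \frac{\pi^{n/2}}{\Gamma(\frac{n}{2} + 1)} r^n. \]

Let $r > 0$. Then

\[
\begin{aligned}
\int_{\mathbb{R}^n \setminus B(0,r)} \frac{1}{|x|^\alpha} \, dx
&= \int_{\mathbb{R}^n} \frac{1}{|x|^\alpha} \chi_{\mathbb{R}^n \setminus B(0,r)}(x) \, dx
= r^n \int_{\mathbb{R}^n} \frac{1}{|rx|^\alpha} \chi_{\mathbb{R}^n \setminus B(0,r)}(rx) \, dx \\
&= r^{n-\alpha} \int_{\mathbb{R}^n} \frac{1}{|x|^\alpha} \chi_{\mathbb{R}^n \setminus B(0,1)}(x) \, dx
= r^{n-\alpha} \int_{\mathbb{R}^n \setminus B(0,1)} \frac{1}{|x|^\alpha} \, dx < \infty, \quad \alpha > n,
\end{aligned}
\]

and, in a similar way,

\[ \int_{B(0,r)} \frac{1}{|x|^\alpha} \, dx = r^{n-\alpha} \int_{B(0,1)} \frac{1}{|x|^\alpha} \, dx < \infty, \quad \alpha < n. \]

Observe that here we formally make the change of variables $x = ry$.

On the other hand, the integrals can be computed directly by (1.1). This gives

\[
\int_{\mathbb{R}^n \setminus B(0,r)} \frac{1}{|x|^\alpha} \, dx
= \omega_{n-1} \int_r^\infty \rho^{-\alpha} \rho^{n-1} \, d\rho
= \frac{\omega_{n-1}}{-\alpha + n} \rho^{-\alpha+n} \Big|_r^\infty
= \frac{\omega_{n-1}}{\alpha - n} r^{-\alpha+n} < \infty, \quad \alpha > n,
\]

and

\[
\int_{B(0,r)} \frac{1}{|x|^\alpha} \, dx
= \omega_{n-1} \int_0^r \rho^{-\alpha} \rho^{n-1} \, d\rho
= \frac{\omega_{n-1}}{-\alpha + n} \rho^{-\alpha+n} \Big|_0^r
= \frac{\omega_{n-1}}{n - \alpha} r^{n-\alpha} < \infty, \quad \alpha < n.
\]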
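As a quick sanity check of the formulas in Example 1.4, the following sketch evaluates $\Omega_n = \pi^{n/2}/\Gamma(\frac{n}{2}+1)$ and $\omega_{n-1} = 2\pi^{n/2}/\Gamma(n/2)$ with the standard library gamma function. The function names are hypothetical and chosen only for this illustration.

```python
import math


def unit_ball_volume(n: int) -> float:
    """Omega_n: Lebesgue measure of the unit ball in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def unit_sphere_area(n: int) -> float:
    """omega_{n-1}: (n-1)-dimensional volume of the unit sphere in R^n."""
    return 2 * math.pi ** (n / 2) / math.gamma(n / 2)


def ball_volume(n: int, r: float) -> float:
    """m(B(x, r)) = Omega_n * r**n, independent of the centre x."""
    return unit_ball_volume(n) * r ** n


if __name__ == "__main__":
    for n in range(1, 6):
        # The last two columns should agree, since omega_{n-1} = n * Omega_n.
        print(n, unit_ball_volume(n), unit_sphere_area(n), n * unit_ball_volume(n))
```

In dimensions 1, 2 and 3 this reproduces the familiar values: length $2$ of the interval $(-1,1)$, area $\pi$ of the unit disc and volume $4\pi/3$ of the unit ball.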

Remarks 1.5: Formula (1.1) implies the following claims:

(1) If $|f(x)| \le c|x|^{-\alpha}$ in a ball $B(0,r)$, $r > 0$, for some $\alpha < n$, then $f \in L^1(B(0,r))$. On the other hand, if $|f(x)| \ge c|x|^{-\alpha}$ in $B(0,r)$ for some $\alpha > n$, then $f \notin L^1(B(0,r))$.

(2) If $|f(x)| \le c|x|^{-\alpha}$ in $\mathbb{R}^n \setminus B(0,r)$ for some $\alpha > n$, then $f \in L^1(\mathbb{R}^n \setminus B(0,r))$. On the other hand, if $|f(x)| \ge c|x|^{-\alpha}$ in $\mathbb{R}^n \setminus B(0,r)$ for some $\alpha < n$, then $f \notin L^1(\mathbb{R}^n \setminus B(0,r))$.

Remark 1.6. $f \in L^p(A) \implies |f(x)| < \infty$ for $\mu$-almost every $x \in A$.

Reason. Let $A_i = \{x \in A : |f(x)| \ge i\}$, $i = 1, 2, \dots$ Then

\[ \{x \in A : |f(x)| = \infty\} = \bigcap_{i=1}^\infty A_i \]

and, since $|f| \ge i$ in $A_i$ and $\int_A |f|^p \, d\mu < \infty$,

\[
\mu(\{x \in A : |f(x)| = \infty\}) \le \mu(A_i) = \int_{A_i} 1 \, d\mu
\le \int_{A_i} \Big( \frac{|f|}{i} \Big)^p d\mu
\le \frac{1}{i^p} \int_A |f|^p \, d\mu \to 0 \quad \text{as } i \to \infty.
\]

The converse is not true, as the previous example shows.

1.2 $L^p$ norm

If $f \in L^p(A)$, $1 \le p < \infty$, the norm of $f$ is the number

\[ \|f\|_p = \Big( \int_A |f|^p \, d\mu \Big)^{1/p}. \]

We shall see that this has the usual properties of the norm:

(1) (Nonnegativity) $0 \le \|f\|_p < \infty$,
(2) $\|f\|_p = 0 \iff f = 0$ $\mu$-almost everywhere in $A$,
(3) (Homogeneity) $\|af\|_p = |a| \|f\|_p$, $a \in \mathbb{R}$,
(4) (Triangle inequality) $\|f + g\|_p \le \|f\|_p + \|g\|_p$.

The claims (1) and (3) are clear. For $p = 1$, the claim (4) follows from the pointwise triangle inequality $|f(x) + g(x)| \le |f(x)| + |g(x)|$. For $p > 1$, the claim (4) is not trivial and we shall prove it later in this section.

Let us recall how to prove (2). Recall that if a property holds except on a set of $\mu$-measure zero, we say that it holds $\mu$-almost everywhere.

$\Longleftarrow$: Assume that $f = 0$ $\mu$-almost everywhere in $A$. Then

\[ \int_A |f|^p \, d\mu = \int_{A \cap \{|f| = 0\}} |f|^p \, d\mu + \int_{A \cap \{|f| > 0\}} |f|^p \, d\mu = 0. \]

The first integral is zero, since the integrand vanishes, and the second integral is zero, since $\mu(A \cap \{|f| > 0\}) = 0$. Thus $\|f\|_p = 0$.

$\Longrightarrow$: Assume that $\|f\|_p = 0$. Let $A_i = \{x \in A : |f(x)| \ge \frac{1}{i}\}$, $i = 1, 2, \dots$ Then

\[ \{x \in A : |f(x)| > 0\} = \bigcup_{i=1}^\infty A_i \]

and, since $i|f| \ge 1$ in $A_i$,

\[ \mu(A_i) = \int_{A_i} 1 \, d\mu \le \int_{A_i} |if|^p \, d\mu \le i^p \int_A |f|^p \, d\mu = 0. \]

Thus $\mu(A_i) = 0$ for every $i = 1, 2, \dots$ and

\[ \mu\Big( \bigcup_{i=1}^\infty A_i \Big) \le \sum_{i=1}^\infty \mu(A_i) = 0. \]

In other words, $f = 0$ $\mu$-almost everywhere in $A$.

If $f$ and $g$ are two $\mu$-measurable functions on a $\mu$-measurable set $A$, then we shall be very much interested in the case $f(x) = g(x)$ for $\mu$-almost every $x \in A$. Of course, this means that $\mu(\{x \in A : f(x) \ne g(x)\}) = 0$. In the case $f = g$ $\mu$-almost everywhere, we do not usually distinguish $f$ from $g$. That is, we shall regard them as equal. We could be very formal and introduce the equivalence relation

\[ f \sim g \iff f = g \ \mu\text{-almost everywhere in } A, \]

but this is hardly necessary. In practice, we think of $f$ as the equivalence class of all functions which are equal to $f$ $\mu$-almost everywhere in $A$. Thus $L^p(A)$ actually consists of equivalence classes rather than functions, but we shall not make the distinction. Indeed, in measure and integration theory we cannot distinguish $f$ from $g$, if the functions are equal $\mu$-almost everywhere. In fact, if $f = g$ $\mu$-almost everywhere in $A$, then $f \in L^p(A) \iff g \in L^p(A)$ and $\|f - g\|_p = 0$. In particular, this implies that $\|f\|_p = \|g\|_p$. On the other hand, if $\|f - g\|_p = 0$, then $f = g$ $\mu$-almost everywhere in $A$.

Another situation that frequently arises is that the function $f$ is defined only almost everywhere. Then we say that $f$ is measurable if and only if its zero extension to the whole space is measurable. Observe that this does not affect the $L^p$ norm of $f$.

Next we show that $L^p(A)$ is a vector space.

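Lemma 1.7.
(i) If $f \in L^p(A)$, then $af \in L^p(A)$, $a \in \mathbb{R}$.
(ii) If $f, g \in L^p(A)$, then $f + g \in L^p(A)$.

Proof. (i):

\[ \int_A |af|^p \, d\mu = |a|^p \int_A |f|^p \, d\mu < \infty. \]

(ii) $p = 1$: The triangle inequality $|f + g| \le |f| + |g|$ implies

\[ \int_A |f + g| \, d\mu \le \int_A |f| \, d\mu + \int_A |g| \, d\mu < \infty. \]

$1 < p < \infty$: The elementary inequality

\[ |a + b|^p \le (|a| + |b|)^p \le (2\max(|a|, |b|))^p = 2^p \max(|a|^p, |b|^p) \le 2^p (|a|^p + |b|^p), \quad a, b \in \mathbb{R}, \ 0 < p < \infty, \tag{1.2} \]

implies

\[ \int_A |f + g|^p \, d\mu \le 2^p \Big( \int_A |f|^p \, d\mu + \int_A |g|^p \, d\mu \Big) < \infty. \]

Remark 1.8. Note that the proof applies for $0 < p < \infty$. Thus $L^p(A)$ is a vector space for $0 < p < \infty$. However, it will be a normed space only for $p \ge 1$, as we shall see later.

Remark 1.9. A more careful analysis gives the useful inequality

\[ |a + b|^p \le 2^{p-1} (|a|^p + |b|^p), \quad a, b \in \mathbb{R}, \ 1 \le p < \infty. \tag{1.3} \]

Remarks 1.10:

(1) If $f : A \to \mathbb{C}$ is a complex-valued function, then $f$ is said to be $\mu$-measurable if and only if $\operatorname{Re} f$ and $\operatorname{Im} f$ are $\mu$-measurable. We say that $f \in L^1(A)$ if $\operatorname{Re} f \in L^1(A)$ and $\operatorname{Im} f \in L^1(A)$, and we define

\[ \int_A f \, d\mu = \int_A \operatorname{Re} f \, d\mu + i \int_A \operatorname{Im} f \, d\mu, \]

where $i$ is the imaginary unit. This integral satisfies the usual linearity properties. It also satisfies the important inequality

\[ \Big| \int_A f \, d\mu \Big| \le \int_A |f| \, d\mu. \]

The definition of the $L^p$ spaces and the norm extends in a natural way to complex-valued functions. Note that the property $\|af\|_p = |a| \|f\|_p$ holds for every $a \in \mathbb{C}$ and thus $L^p$ is a complex vector space.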

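(2) The space $L^2(A)$ is an inner product space with the inner product

\[ \langle f, g \rangle = \int_A f \overline{g} \, d\mu, \quad f, g \in L^2(A). \]

Here $\overline{g}$ is the complex conjugate, which can be neglected if the functions are real-valued. This inner product induces the standard $L^2$-norm, since

\[ \|f\|_2 = \Big( \int_A |f|^2 \, d\mu \Big)^{1/2} = \Big( \int_A f \overline{f} \, d\mu \Big)^{1/2} = \langle f, f \rangle^{1/2}. \]

(3) In the special case that $A = \mathbb{N}$ and $\mu$ is the counting measure, the $L^p(\mathbb{N})$ spaces are denoted by $\ell^p$ and

\[ \ell^p = \Big\{ (x_i) : \sum_{i=1}^\infty |x_i|^p < \infty \Big\}, \quad 1 \le p < \infty. \]

Here $(x_i)$ is a sequence of real (or complex) numbers. In this case,

\[ \int_{\mathbb{N}} x \, d\mu = \sum_{i=1}^\infty x(i) \]

for every nonnegative function $x$ on $\mathbb{N}$. Thus

\[ \|x\|_p = \Big( \sum_{i=1}^\infty |x_i|^p \Big)^{1/p}. \]

Note that the theory of $L^p$ spaces applies to these sequence spaces as well.

Definition 1.11. Let $1 < p < \infty$. The Hölder conjugate $p'$ of $p$ is the number which satisfies

\[ \frac{1}{p} + \frac{1}{p'} = 1. \]

For $p = 1$ we define $p' = \infty$ and if $p = \infty$, then $p' = 1$.

Remark 1.12. Note that

\[ p' = \frac{p}{p-1}, \]

$p = 2 \implies p' = 2$, $1 < p < 2 \implies p' > 2$, $2 < p < \infty \implies 1 < p' < 2$, $p \to 1 \implies p' \to \infty$, and $(p')' = p$.

Lemma 1.13 (Young's inequality). Let $1 < p < \infty$. Then for every $a \ge 0$, $b \ge 0$,

\[ ab \le \frac{a^p}{p} + \frac{b^{p'}}{p'}, \]

with equality if and only if $a^p = b^{p'}$.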
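The statement of Lemma 1.13 can be sanity-checked numerically. The sketch below is only an illustration, not a proof: it evaluates the difference between the right-hand and left-hand sides on sample values, and the helper names `conjugate_exponent` and `young_gap` are hypothetical.

```python
def conjugate_exponent(p: float) -> float:
    """Hölder conjugate p' = p / (p - 1), assuming 1 < p < infinity."""
    return p / (p - 1.0)


def young_gap(a: float, b: float, p: float) -> float:
    """a**p / p + b**p' / p' - a*b, which Young's inequality claims is >= 0."""
    q = conjugate_exponent(p)
    return a ** p / p + b ** q / q - a * b


if __name__ == "__main__":
    # A generic pair (a, b): the gap is strictly positive.
    print(young_gap(1.0, 2.0, 2.0))
    # The equality case a**p = b**p': here 2**3 = 8 = 4**1.5, so the gap vanishes
    # up to rounding.
    print(young_gap(2.0, 4.0, 3.0))
```

The second call illustrates the equality condition $a^p = b^{p'}$ with $p = 3$, $p' = 3/2$, $a = 2$ and $b = 4$.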

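THE MORAL: Young's inequality is a very useful tool in splitting a product into a sum. Moreover, it shows where the conjugate exponent $p'$ comes from.

Proof. The claim is obviously true, if $a = 0$ or $b = 0$. Thus we may assume that $a > 0$ and $b > 0$. Clearly

\[
ab \le \frac{a^p}{p} + \frac{b^{p'}}{p'}
\iff \frac{1}{p} a^p + \frac{1}{p'} b^{p'} - ab \ge 0
\iff \frac{1}{p} \Big( \frac{a}{b^{p'/p}} \Big)^p + \frac{1}{p'} - \frac{a}{b^{p'/p}} \ge 0,
\]

where the last equivalence follows by dividing by $b^{p'}$ and using $p' - 1 = p'/p$. Let $t = a/b^{p'/p}$ and define $\varphi : (0, \infty) \to \mathbb{R}$,

\[ \varphi(t) = \frac{1}{p} t^p + \frac{1}{p'} - t. \]

Then

\[ \lim_{t \to 0+} \varphi(t) = \frac{1}{p'}, \quad \lim_{t \to \infty} \varphi(t) = \infty \quad \text{and} \quad \varphi'(t) = t^{p-1} - 1. \]

Note that $\varphi'(t) = 0 \iff t = 1$, from which we conclude

\[ \varphi(t) \ge \varphi(1) = \frac{1}{p} + \frac{1}{p'} - 1 = 0 \quad \text{for every } t > 0. \]

Moreover, $\varphi(t) > 0$, if $t \ne 1$. It follows that $\varphi(t) = 0$ if and only if $a/b^{p'/p} = t = 1$.

Remarks 1.14:

(1) Young's inequality for $p = 2$ follows immediately from

\[ (a - b)^2 \ge 0 \iff a^2 - 2ab + b^2 \ge 0 \iff \frac{a^2}{2} + \frac{b^2}{2} - ab \ge 0. \]

(2) Young's inequality can also be proved geometrically. To see this, consider the curves $y = x^{p-1}$ and the inverse $x = y^{1/(p-1)} = y^{p'-1}$. Then

\[ \int_0^a x^{p-1} \, dx = \frac{a^p}{p} \quad \text{and} \quad \int_0^b y^{p'-1} \, dy = \frac{b^{p'}}{p'}. \]

By comparing the areas under the curves that these integrals measure, we have

\[ ab \le \int_0^a x^{p-1} \, dx + \int_0^b y^{p'-1} \, dy = \frac{a^p}{p} + \frac{b^{p'}}{p'}. \]

Theorem 1.15 (Hölder's inequality). Let $1 < p < \infty$ and assume that $f \in L^p(A)$ and $g \in L^{p'}(A)$. Then $fg \in L^1(A)$ and

\[ \int_A |fg| \, d\mu \le \Big( \int_A |f|^p \, d\mu \Big)^{1/p} \Big( \int_A |g|^{p'} \, d\mu \Big)^{1/p'}. \]

Moreover, an equality occurs if and only if there exists a constant $c$ such that $|f(x)|^p = c|g(x)|^{p'}$ for $\mu$-almost every $x \in A$.

THE MORAL: Hölder's inequality is a very useful tool in estimating a product of functions.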

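Remark 1.16. Hölder's inequality states that $\|fg\|_1 \le \|f\|_p \|g\|_{p'}$, $1 < p < \infty$. Observe that for $p = 2$ this is the Cauchy-Schwarz inequality $|\langle f, g \rangle| \le \|f\|_2 \|g\|_2$.

Proof. If $\|f\|_p = 0$, then $f = 0$ $\mu$-almost everywhere in $A$ and thus $fg = 0$ $\mu$-almost everywhere in $A$. Thus the result is clear, if $\|f\|_p = 0$ or $\|g\|_{p'} = 0$. We can therefore assume that $\|f\|_p > 0$ and $\|g\|_{p'} > 0$. Define

\[ \tilde{f} = \frac{f}{\|f\|_p} \quad \text{and} \quad \tilde{g} = \frac{g}{\|g\|_{p'}}. \]

Then

\[ \|\tilde{f}\|_p = \frac{\|f\|_p}{\|f\|_p} = 1 \quad \text{and} \quad \|\tilde{g}\|_{p'} = 1. \]

By Young's inequality

\[
\frac{1}{\|f\|_p \|g\|_{p'}} \int_A |fg| \, d\mu = \int_A |\tilde{f} \tilde{g}| \, d\mu
\le \frac{1}{p} \int_A |\tilde{f}|^p \, d\mu + \frac{1}{p'} \int_A |\tilde{g}|^{p'} \, d\mu
= \frac{1}{p} + \frac{1}{p'} = 1,
\]

since both integrals on the right-hand side are equal to one. An equality holds if and only if

\[ \int_A \Big( \frac{1}{p} |\tilde{f}|^p + \frac{1}{p'} |\tilde{g}|^{p'} - |\tilde{f} \tilde{g}| \Big) d\mu = 0, \]

where the integrand is nonnegative, which implies that

\[ \frac{1}{p} |\tilde{f}|^p + \frac{1}{p'} |\tilde{g}|^{p'} - |\tilde{f} \tilde{g}| = 0 \]

$\mu$-almost everywhere in $A$. The equality occurs in Young's inequality if and only if $|\tilde{f}|^p = |\tilde{g}|^{p'}$ $\mu$-almost everywhere in $A$. Thus

\[ |f(x)|^p = \frac{\|f\|_p^p}{\|g\|_{p'}^{p'}} |g(x)|^{p'} \quad \text{for } \mu\text{-almost every } x \in A. \]

WARNING: $f \in L^p(A)$ and $g \in L^p(A)$ does not imply that $fg \in L^p(A)$.

Reason. Let $f : (0,1) \to \mathbb{R}$, $f(x) = \frac{1}{\sqrt{x}}$, $g = f$, and assume that $\mu$ is the Lebesgue measure. Then $f \in L^1((0,1))$ and $g \in L^1((0,1))$, but $(fg)(x) = f(x)g(x) = \frac{1}{x}$ and $fg \notin L^1((0,1))$.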
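For finite sequences, that is, for the counting measure on a finite index set as in Remarks 1.10 (3), Hölder's inequality and its equality case can be checked directly. The following sketch is an illustration under that assumption; `lp_norm` and `holder_sides` are hypothetical helper names.

```python
def lp_norm(x, p: float) -> float:
    """(sum |x_i|**p)**(1/p) for a finite sequence x and 0 < p < infinity."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def holder_sides(x, y, p: float):
    """Return (sum |x_i y_i|, ||x||_p * ||y||_p') for 1 < p < infinity."""
    q = p / (p - 1.0)  # Hölder conjugate exponent p'
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    rhs = lp_norm(x, p) * lp_norm(y, q)
    return lhs, rhs


if __name__ == "__main__":
    # Generic sequences: the left-hand side is strictly smaller.
    print(holder_sides([1.0, 2.0, 3.0], [4.0, -5.0, 6.0], 3.0))
    # Equality case |x_i|**p = |y_i|**p' with p = 3, p' = 3/2: take x_i = y_i**(1/2).
    y = [1.0, 2.0, 3.0]
    x = [t ** 0.5 for t in y]
    print(holder_sides(x, y, 3.0))
```

In the second call both sides agree up to rounding, which matches the equality condition in Theorem 1.15 with $c = 1$.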

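Remarks 1.17:

(1) For $p = 2$ we have Schwarz's inequality

\[ \int_A |fg| \, d\mu \le \Big( \int_A |f|^2 \, d\mu \Big)^{1/2} \Big( \int_A |g|^2 \, d\mu \Big)^{1/2}. \]

(2) Hölder's inequality holds for arbitrary measurable functions with the interpretation that the integrals may be infinite. (Exercise)

Lemma 1.18 (Jensen's inequality). Let $1 \le p < q < \infty$ and assume that $0 < \mu(A) < \infty$. Then

\[ \Big( \frac{1}{\mu(A)} \int_A |f|^p \, d\mu \Big)^{1/p} \le \Big( \frac{1}{\mu(A)} \int_A |f|^q \, d\mu \Big)^{1/q}. \]

THE MORAL: The integral average is an increasing function of the power.

Proof. By Hölder's inequality

\[
\int_A |f|^p \, d\mu \le \Big( \int_A |f|^{p \cdot q/p} \, d\mu \Big)^{p/q} \Big( \int_A 1^{q/(q-p)} \, d\mu \Big)^{(q-p)/q}
= \Big( \int_A |f|^q \, d\mu \Big)^{p/q} \mu(A)^{1 - p/q}.
\]

Dividing by $\mu(A)$ and taking the power $1/p$ gives the claim.

Remark 1.19. If $1 \le p < q < \infty$ and $\mu(A) < \infty$, then $L^q(A) \subset L^p(A)$.

WARNING: Let $1 \le p < q < \infty$. In general, neither $L^q(A) \subset L^p(A)$ nor $L^p(A) \subset L^q(A)$ holds.

Reason. Let $f : (0, \infty) \to \mathbb{R}$, $f(x) = x^a$, and assume that $\mu$ is the Lebesgue measure. Then

\[ f \in L^1((0,1)) \iff a > -1 \quad \text{and} \quad f \in L^1((1, \infty)) \iff a < -1. \]

Assume that $1 \le p < q < \infty$. Choose $b$ such that $1/q < b < 1/p$. Then the function $x^{-b} \chi_{(0,1)}(x)$ belongs to $L^p((0, \infty))$, but does not belong to $L^q((0, \infty))$. On the other hand, the function $x^{-b} \chi_{(1, \infty)}(x)$ belongs to $L^q((0, \infty))$, but does not belong to $L^p((0, \infty))$.

Examples 1.20:

(1) Let $A = (0,1)$, $\mu$ be the Lebesgue measure and $1 \le p < \infty$. Define $f : (0,1) \to \mathbb{R}$,

\[ f(x) = \frac{1}{x^{1/p} (\log(2/x))^{2/p}}. \]

Then $f \in L^p((0,1))$, but $f \notin L^q((0,1))$ for any $q > p$. Thus for every $p$ with $1 \le p < \infty$, there exists a function $f$ which belongs to $L^p((0,1))$, but does not belong to any higher $L^q((0,1))$ with $q > p$. (Exercise)

(2) Let $1 \le p < q < \infty$. Assume that $A$ contains $\mu$-measurable sets of arbitrarily small positive measure. Then there are pairwise disjoint $\mu$-measurable sets $A_i \subset A$, $i = 1, 2, \dots$, such that $\mu(A_i) > 0$ and $\mu(A_i) \to 0$ as $i \to \infty$. Let

\[ f = \sum_{i=1}^\infty a_i \chi_{A_i}, \]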

where $a_i$ are chosen so that

\[ \sum_{i=1}^\infty a_i^q \mu(A_i) = \infty \quad \text{and} \quad \sum_{i=1}^\infty a_i^p \mu(A_i) < \infty. \]

Then $f \in L^p(A) \setminus L^q(A)$. It can be shown that $L^p(A)$ is not contained in $L^q(A)$ if and only if $A$ contains measurable sets of arbitrarily small positive measure. (Exercise)

(3) Let $1 \le p < q < \infty$. Assume that $A$ contains $\mu$-measurable sets of arbitrarily large measure. Then there are pairwise disjoint $\mu$-measurable sets $A_i \subset A$, $i = 1, 2, \dots$, such that $\mu(A_i) > 0$ and $\mu(A_i) \to \infty$ as $i \to \infty$. Let

\[ f = \sum_{i=1}^\infty a_i \chi_{A_i}, \]

where $a_i \ge 0$ are chosen so that

\[ \sum_{i=1}^\infty a_i^q \mu(A_i) < \infty \quad \text{and} \quad \sum_{i=1}^\infty a_i^p \mu(A_i) = \infty. \]

Then $f \in L^q(A) \setminus L^p(A)$. It can be shown that $L^q(A)$ is not contained in $L^p(A)$ if and only if $A$ contains measurable sets of arbitrarily large measure. (Exercise)

Remark 1.21. There is a more general version of Jensen's inequality. Assume that $0 < \mu(A) < \infty$. Let $f \in L^1(A)$ be such that $a < f(x) < b$ for every $x \in A$. If $\varphi$ is a convex function on $(a, b)$, then

\[ \varphi\Big( \frac{1}{\mu(A)} \int_A f \, d\mu \Big) \le \frac{1}{\mu(A)} \int_A \varphi \circ f \, d\mu. \]

The cases $a = -\infty$ and $b = \infty$ are not excluded. Observe that in this case it may happen that $\varphi \circ f$ is not integrable. We leave the proof as an exercise.

Theorem 1.22 (Minkowski's inequality). Assume $1 \le p < \infty$ and $f, g \in L^p(A)$. Then $f + g \in L^p(A)$ and

\[ \|f + g\|_p \le \|f\|_p + \|g\|_p. \]

Moreover, an equality occurs if and only if there exists a positive constant $c$ such that $f(x) = cg(x)$ for $\mu$-almost every $x \in A$.

THE MORAL: Minkowski's inequality is the triangle inequality for the $L^p$-norm. It implies that the $L^p$ norm, with $1 \le p < \infty$, is a norm in the usual sense and that $L^p(A)$ is a normed space if the functions that coincide almost everywhere are identified.

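Remark 1.23. Elementary inequalities (1.3) and (1.4) imply that

\[
\begin{aligned}
\|f + g\|_p &= \Big( \int_A |f + g|^p \, d\mu \Big)^{1/p}
\le \Big( 2^{p-1} \int_A (|f|^p + |g|^p) \, d\mu \Big)^{1/p} \\
&\le 2^{(p-1)/p} \Big( \Big( \int_A |f|^p \, d\mu \Big)^{1/p} + \Big( \int_A |g|^p \, d\mu \Big)^{1/p} \Big)
= 2^{(p-1)/p} (\|f\|_p + \|g\|_p), \quad 1 \le p < \infty.
\end{aligned}
\]

Observe that the factor $2^{(p-1)/p}$ is strictly greater than one for $p > 1$ and Minkowski's inequality does not follow from this.

Proof. $p = 1$: The triangle inequality, as in the proof of Lemma 1.7, shows that $\|f + g\|_1 \le \|f\|_1 + \|g\|_1$.

$1 < p < \infty$: If $\|f + g\|_p = 0$, there is nothing to prove. Thus we may assume that $\|f + g\|_p > 0$. The pointwise inequality

\[ |f + g|^p = |f + g|^{p-1} |f + g| \le |f + g|^{p-1} (|f| + |g|) \]

and Hölder's inequality give

\[
\begin{aligned}
\int_A |f + g|^p \, d\mu
&\le \int_A |f + g|^{p-1} |f| \, d\mu + \int_A |f + g|^{p-1} |g| \, d\mu \\
&\le \Big( \int_A |f + g|^{(p-1)p'} \, d\mu \Big)^{1/p'} \Big( \int_A |f|^p \, d\mu \Big)^{1/p}
+ \Big( \int_A |f + g|^{(p-1)p'} \, d\mu \Big)^{1/p'} \Big( \int_A |g|^p \, d\mu \Big)^{1/p}.
\end{aligned}
\]

Since $(p-1)p' = p$, $1/p' = (p-1)/p$ and $0 < \|f + g\|_p < \infty$, we may divide by the common factor and obtain

\[ \Big( \int_A |f + g|^p \, d\mu \Big)^{1 - (p-1)/p} \le \Big( \int_A |f|^p \, d\mu \Big)^{1/p} + \Big( \int_A |g|^p \, d\mu \Big)^{1/p}, \]

where the exponent on the left-hand side is $1/p$.

It remains to consider when the equality can occur. This happens if there is an equality in the pointwise inequality

\[ |f + g|^p = |f + g|^{p-1} |f + g| \le |f + g|^{p-1} (|f| + |g|) \]

as well as equality in the applications of Hölder's inequality. Equality occurs in the pointwise inequality above if $f(x)$ and $g(x)$ have the same sign. On the other hand, equality occurs in Hölder's inequality if

\[ c_1 |f(x)|^p = |f(x) + g(x)|^p = c_2 |g(x)|^p \]

for $\mu$-almost every $x \in A$. This completes the proof.

Note that the normed space $L^p(A)$, $1 \le p < \infty$, is a metric space with the metric

\[ d(f, g) = \|f - g\|_p. \]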

1.3 $L^p$ spaces for $0 < p < 1$

It is sometimes useful to consider $L^p$ spaces for $0 < p < 1$. Observe that Definition 1.1 makes sense also when $0 < p < 1$ and the space is a vector space by the same argument as in the proof of Lemma 1.7. However, $\|f\|_p$ is not a norm if $0 < p < 1$.

Reason. Let $f = \chi_{[0, \frac{1}{2})}$ and $g = \chi_{[\frac{1}{2}, 1]}$. Then $f + g = \chi_{[0,1]}$ so that $\|f + g\|_p = 1$. On the other hand, $\|f\|_p = 2^{-1/p}$ and $\|g\|_p = 2^{-1/p}$. Thus

\[ \|f\|_p + \|g\|_p = 2 \cdot 2^{-1/p} = 2^{1 - 1/p} < 1, \]

when $0 < p < 1$. This shows that $\|f\|_p + \|g\|_p < \|f + g\|_p$.

Thus the triangle inequality does not hold true when $0 < p < 1$, but we have the following result.

Lemma 1.24. If $f, g \in L^p(A)$ and $0 < p < 1$, then $f + g \in L^p(A)$ and

\[ \|f + g\|_p^p \le \|f\|_p^p + \|g\|_p^p. \]

Proof. The elementary inequality

\[ (a + b)^p \le a^p + b^p, \quad a, b \ge 0, \ 0 < p < 1, \tag{1.4} \]

implies

\[ \|f + g\|_p^p = \int_A |f + g|^p \, d\mu \le \int_A |f|^p \, d\mu + \int_A |g|^p \, d\mu = \|f\|_p^p + \|g\|_p^p. \]

However, $L^p(A)$ is a metric space with the metric

\[ d(f, g) = \|f - g\|_p^p = \int_A |f - g|^p \, d\mu. \]

This metric is not induced by a norm, since $\|f\|_p^p$ does not satisfy the homogeneity required by the norm. On the other hand, $\|f\|_p$ satisfies the homogeneity, but not the triangle inequality.

Remarks 1.25:

(1) By (1.2), we have

\[ \|f + g\|_p \le (\|f\|_p^p + \|g\|_p^p)^{1/p} \le 2^{1/p} (\|f\|_p + \|g\|_p), \quad 0 < p < 1. \]

Thus a quasi triangle inequality holds with a multiplicative constant.

(2) If $f, g \in L^p(A)$, $f \ge 0$, $g \ge 0$, then

\[ \|f + g\|_p \ge \|f\|_p + \|g\|_p, \quad 0 < p < 1. \]

This is the triangle inequality in the wrong direction (exercise).
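The failure of the triangle inequality for $0 < p < 1$ can be reproduced with step functions on $[0,1]$. The sketch below assumes that a function is represented by its constant values on equally long subintervals of $[0,1]$ (with the Lebesgue measure); `step_lp` is a hypothetical helper name.

```python
def step_lp(values, p: float) -> float:
    """(integral over [0,1] of |f|**p)**(1/p) for a step function.

    `values` lists the constant values of f on len(values) subintervals
    of [0,1], each of length 1 / len(values).
    """
    n = len(values)
    return (sum(abs(v) ** p for v in values) / n) ** (1.0 / p)


if __name__ == "__main__":
    p = 0.5
    f = [1.0, 0.0]  # characteristic function of [0, 1/2)
    g = [0.0, 1.0]  # characteristic function of [1/2, 1]
    h = [a + b for a, b in zip(f, g)]  # f + g, the characteristic function of [0, 1]
    # The triangle inequality fails: ||f||_p + ||g||_p = 2**(1 - 1/p) < 1 = ||f + g||_p.
    print(step_lp(f, p) + step_lp(g, p), step_lp(h, p))
    # Lemma 1.24 still holds: ||f + g||_p**p <= ||f||_p**p + ||g||_p**p.
    print(step_lp(h, p) ** p, step_lp(f, p) ** p + step_lp(g, p) ** p)
```

With $p = 1/2$ the sum of the quasinorms is $2^{1 - 1/p} = 1/2$, while $\|f + g\|_p = 1$, exactly as in the computation above.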

Remark 1.26. It is possible to define the $L^p$ spaces also when $p < 0$. A $\mu$-measurable function $f$ is in $L^p(A)$ for $p < 0$, if

\[ 0 < \int_A |f|^p \, d\mu = \int_A |f|^{-p'/(1-p')} \, d\mu < \infty, \]

where $\frac{1}{p} + \frac{1}{p'} = 1$. If $f \in L^p(A)$ for $p < 0$, then $f \ne 0$ $\mu$-almost everywhere and $|f| < \infty$ $\mu$-almost everywhere. However, this is not a vector space.

1.4 Completeness of $L^p$

Next we shall prove a famous theorem, which is not only important in integration theory, but is of great historical interest as well. The result was found independently by F. Riesz and E. Fischer in 1907, primarily in connection with Fourier series, which culminates in showing the completeness of $L^2$.

Recall that a sequence $(f_i)$ of functions $f_i \in L^p(A)$, $i = 1, 2, \dots$, converges in $L^p(A)$ to a function $f \in L^p(A)$, if for every $\varepsilon > 0$ there exists $i_\varepsilon$ such that

\[ \|f_i - f\|_p < \varepsilon \quad \text{when } i \ge i_\varepsilon. \]

Equivalently,

\[ \lim_{i \to \infty} \|f_i - f\|_p = 0. \]

A sequence $(f_i)$ is a Cauchy sequence in $L^p(A)$, if for every $\varepsilon > 0$ there exists $i_\varepsilon$ such that

\[ \|f_i - f_j\|_p < \varepsilon \quad \text{when } i, j \ge i_\varepsilon. \]

WARNING: This is not the same condition as $\|f_{i+1} - f_i\|_p < \varepsilon$ when $i \ge i_\varepsilon$. Indeed, the Cauchy sequence condition implies this, but the converse is not true (exercise).

CLAIM: If $f_i \to f$ in $L^p(A)$, then $(f_i)$ is a Cauchy sequence in $L^p(A)$.

Reason. By Minkowski's inequality

\[ \|f_i - f_j\|_p \le \|f_i - f\|_p + \|f - f_j\|_p < \varepsilon, \]

when $i$ and $j$ are sufficiently large.

Theorem 1.27 (Riesz-Fischer). If $(f_i)$ is a Cauchy sequence in $L^p(A)$, $1 \le p < \infty$, then there exists $f \in L^p(A)$ such that $f_i \to f$ in $L^p(A)$ as $i \to \infty$.

THE MORAL: $L^p(A)$, $1 \le p < \infty$, is a Banach space with the norm $\|\cdot\|_p$. In particular, $L^2(A)$ is a Hilbert space.

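Proof. Assume that $(f_i)$ is a Cauchy sequence in $L^p(A)$. We choose a subsequence as follows. Choose $i_1$ such that

\[ \|f_i - f_j\|_p < \frac{1}{2} \quad \text{when } i, j \ge i_1. \]

We continue recursively. Suppose that $i_1, i_2, \dots, i_k$ have been chosen such that

\[ \|f_i - f_j\|_p < \frac{1}{2^k} \quad \text{when } i, j \ge i_k. \]

Then choose $i_{k+1} > i_k$ such that

\[ \|f_i - f_j\|_p < \frac{1}{2^{k+1}} \quad \text{when } i, j \ge i_{k+1}. \]

For the subsequence $(f_{i_k})$, we have

\[ \|f_{i_k} - f_{i_{k+1}}\|_p < \frac{1}{2^k}, \quad k = 1, 2, \dots \]

Define

\[ g_l = \sum_{k=1}^l |f_{i_{k+1}} - f_{i_k}| \quad \text{and} \quad g = \sum_{k=1}^\infty |f_{i_{k+1}} - f_{i_k}|. \]

Then

\[ \lim_{l \to \infty} g_l = \lim_{l \to \infty} \sum_{k=1}^l |f_{i_{k+1}} - f_{i_k}| = \sum_{k=1}^\infty |f_{i_{k+1}} - f_{i_k}| = g \]

and as a limit of $\mu$-measurable functions $g$ is a $\mu$-measurable function. Fatou's lemma and Minkowski's inequality imply

\[
\Big( \int_A g^p \, d\mu \Big)^{1/p} \le \liminf_{l \to \infty} \Big( \int_A g_l^p \, d\mu \Big)^{1/p}
\le \liminf_{l \to \infty} \sum_{k=1}^l \Big( \int_A |f_{i_{k+1}} - f_{i_k}|^p \, d\mu \Big)^{1/p}
\le \sum_{k=1}^\infty \frac{1}{2^k} = 1.
\]

Thus $g \in L^p(A)$ and consequently $g(x) < \infty$ for $\mu$-almost every $x \in A$. It follows that the series

\[ f_{i_1}(x) + \sum_{k=1}^\infty (f_{i_{k+1}}(x) - f_{i_k}(x)) \]

converges absolutely for $\mu$-almost every $x \in A$. Denote the sum of the series by $f(x)$ for those $x \in A$ at which it converges and set $f(x) = 0$ in the remaining set of measure zero. Since the partial sums telescope, we have

\[
f(x) = f_{i_1}(x) + \sum_{k=1}^\infty (f_{i_{k+1}}(x) - f_{i_k}(x))
= \lim_{l \to \infty} \Big( f_{i_1}(x) + \sum_{k=1}^l (f_{i_{k+1}}(x) - f_{i_k}(x)) \Big)
= \lim_{l \to \infty} f_{i_{l+1}}(x) = \lim_{k \to \infty} f_{i_k}(x)
\]

for $\mu$-almost every $x \in A$. Thus there is a subsequence which converges $\mu$-almost everywhere in $A$. Next we show that the original sequence converges to $f$ in $L^p(A)$.

CLAIM: $f_i \to f$ in $L^p(A)$ as $i \to \infty$.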

Reason. Let $\varepsilon > 0$. Since $(f_i)$ is a Cauchy sequence in $L^p(A)$, there exists $i_\varepsilon$ such that $\|f_i - f_j\|_p < \varepsilon$ when $i, j \ge i_\varepsilon$. For a fixed $i \ge i_\varepsilon$, we have $f_{i_k} - f_i \to f - f_i$ $\mu$-almost everywhere in $A$ as $k \to \infty$. By Fatou's lemma

\[ \Big( \int_A |f - f_i|^p \, d\mu \Big)^{1/p} \le \liminf_{k \to \infty} \Big( \int_A |f_{i_k} - f_i|^p \, d\mu \Big)^{1/p} \le \varepsilon. \]

This shows that $f - f_i \in L^p(A)$ and thus $f = (f - f_i) + f_i \in L^p(A)$. Moreover, for every $\varepsilon > 0$ there exists $i_\varepsilon$ such that $\|f_i - f\|_p \le \varepsilon$ when $i \ge i_\varepsilon$. This completes the proof.

WARNING: In general, if a sequence has a converging subsequence, the original sequence need not converge. In the proof above, we used the fact that we have a Cauchy sequence.

We shall often use a particular part of the proof of the Riesz-Fischer theorem, which we now state.

Corollary 1.28. If $f_i \to f$ in $L^p(A)$, then there exists a subsequence $(f_{i_k})$ such that

\[ \lim_{k \to \infty} f_{i_k}(x) = f(x) \quad \text{for } \mu\text{-almost every } x \in A. \]

Proof. The proof of the Riesz-Fischer theorem gives a subsequence $(f_{i_k})$ and a function $g \in L^p(A)$ such that

\[ \lim_{k \to \infty} f_{i_k}(x) = g(x) \quad \text{for } \mu\text{-almost every } x \in A \]

and $f_{i_k} \to g$ in $L^p(A)$. On the other hand, $f_i \to f$ in $L^p(A)$, which implies that $f_{i_k} \to f$ in $L^p(A)$. By the uniqueness of the limit, we conclude that $f = g$ $\mu$-almost everywhere in $A$.

Remarks 1.29: Let us compare the various modes of convergence of a sequence $(f_i)$ of functions in $L^p(A)$.

(1) If $f_i \to f$ in $L^p(A)$, then

\[ \lim_{i \to \infty} \|f_i\|_p = \|f\|_p. \]

Reason. $\|f_i\|_p = \|f_i - f + f\|_p \le \|f_i - f\|_p + \|f\|_p$ implies $\|f_i\|_p - \|f\|_p \le \|f_i - f\|_p$. In the same way $\|f\|_p - \|f_i\|_p \le \|f_i - f\|_p$. Thus

\[ \big| \|f_i\|_p - \|f\|_p \big| \le \|f_i - f\|_p \to 0, \]

from which it follows that $\lim_{i \to \infty} \|f_i\|_p = \|f\|_p$.

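(2) $f_i \to f$ in $L^p(A)$ implies that $f_i \to f$ in measure.

Reason. By Chebyshev's inequality

\[ \mu(\{x \in A : |f_i(x) - f(x)| \ge \varepsilon\}) \le \frac{1}{\varepsilon^p} \int_A |f_i - f|^p \, d\mu = \frac{1}{\varepsilon^p} \|f_i - f\|_p^p \to 0 \quad \text{as } i \to \infty. \]

(3) If $f_i \to f$ in $L^p(A)$, then there exists a subsequence $(f_{i_k})$ such that

\[ \lim_{k \to \infty} f_{i_k}(x) = f(x) \quad \text{for } \mu\text{-almost every } x \in A. \]

Reason. The convergence in measure implies the existence of an almost everywhere converging subsequence. This gives another proof of the previous corollary.

(4) In the case $p = 1$, $f_i \to f$ in $L^1(A)$ implies not only that

\[ \lim_{i \to \infty} \int_A |f_i| \, d\mu = \int_A |f| \, d\mu \]

but also that

\[ \lim_{i \to \infty} \int_A f_i \, d\mu = \int_A f \, d\mu. \]

Reason.

\[ \Big| \int_A (f_i - f) \, d\mu \Big| \le \int_A |f_i - f| \, d\mu = \|f_i - f\|_1 \to 0 \quad \text{as } i \to \infty. \]

THE MORAL: This is a useful tool in showing that a sequence does not converge in $L^p$.

Example 1.30. Let $f_i = \chi_{[i-1, i)}$, $i = 1, 2, \dots$, and $f = 0$. Assume that $\mu$ is the Lebesgue measure. Then

\[ \lim_{i \to \infty} f_i(x) = f(x) \quad \text{for every } x \in \mathbb{R}. \]

However, $\|f_i\|_p = 1$ for every $i = 1, 2, \dots$ and $\|f\|_p = 0$. Thus the sequence $(f_i)$ does not converge to $f$ in $L^p(\mathbb{R})$, $1 \le p < \infty$.

Example 1.31. In the following examples we assume that $\mu$ is the Lebesgue measure.

(1) $f_i \to f$ almost everywhere does not imply $f_i \to f$ in $L^p$. Let

\[ f_i = i^2 \chi_{(0, \frac{1}{i})}, \quad i = 1, 2, \dots \]

Then

\[ \int_{\mathbb{R}} |f_i|^p \, dx = i^{2p} \int_{\mathbb{R}} \chi_{(0, \frac{1}{i})}^p \, dx = i^{2p} \cdot \frac{1}{i} = i^{2p-1} < \infty. \]

Thus $f_i \in L^p(\mathbb{R})$, $1 \le p < \infty$, and $f_i(x) \to 0$ for every $x \in \mathbb{R}$, but

\[ \|f_i\|_p = i^{2 - \frac{1}{p}} \ge i \to \infty \quad \text{as } i \to \infty. \]

Thus $(f_i)$ does not converge in $L^p(\mathbb{R})$.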

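(2) $f_i \to f$ in $L^p$ does not imply $f_i \to f$ almost everywhere. Consider the sliding sequence of functions

\[ f_{2^k + j} = k \chi_{[\frac{j}{2^k}, \frac{j+1}{2^k}]}, \quad k = 0, 1, 2, \dots, \ j = 0, 1, 2, \dots, 2^k - 1. \]

Then

\[ \|f_{2^k + j}\|_p = k 2^{-k/p} \to 0 \quad \text{as } k \to \infty, \]

which implies that $f_i \to 0$ in $L^p(\mathbb{R})$, $1 \le p < \infty$, as $i \to \infty$. However, the sequence $(f_i(x))$ fails to converge for every $x \in [0,1]$, since

\[ \limsup_{i \to \infty} f_i(x) = \infty \quad \text{and} \quad \liminf_{i \to \infty} f_i(x) = 0 \]

for every $x \in [0,1]$. Note that there are many converging subsequences. For example, $f_{2^k + 1}(x) \to 0$ for every $x \in [0,1]$ as $k \to \infty$.

(3) A sequence can converge in $L^p$ without converging in $L^q$. Consider

\[ f_i = i^{-1} \chi_{(i, 2i)}, \quad i = 1, 2, \dots \]

Then $\|f_i\|_p = i^{-1 + 1/p}$. Thus $f_i \to 0$ in $L^p(\mathbb{R})$, $1 < p < \infty$, but $\|f_i\|_1 = 1$ for every $i = 1, 2, \dots$, so that the sequence $(f_i)$ does not converge in $L^1(\mathbb{R})$.

The following theorem clarifies the difference between pointwise convergence and $L^p$-convergence.

Theorem 1.32. Assume that $f_i \in L^p(A)$, $i = 1, 2, \dots$, and $f \in L^p(A)$, $1 \le p < \infty$. If $f_i \to f$ $\mu$-almost everywhere in $A$ and $\lim_{i \to \infty} \|f_i\|_p = \|f\|_p$, then $f_i \to f$ in $L^p(A)$ as $i \to \infty$.

Proof. Since $|f_i| < \infty$ and $|f| < \infty$ $\mu$-almost everywhere in $A$, by (1.2), we have

\[ 2^p (|f_i|^p + |f|^p) - |f_i - f|^p \ge 0 \quad \mu\text{-almost everywhere in } A. \]

The assumption $f_i \to f$ $\mu$-almost everywhere in $A$ implies

\[ \lim_{i \to \infty} \big( 2^p (|f_i|^p + |f|^p) - |f_i - f|^p \big) = 2^{p+1} |f|^p \quad \mu\text{-almost everywhere in } A. \]

Applying Fatou's lemma, we obtain

\[
\begin{aligned}
\int_A 2^{p+1} |f|^p \, d\mu
&\le \liminf_{i \to \infty} \int_A \big( 2^p (|f_i|^p + |f|^p) - |f_i - f|^p \big) \, d\mu \\
&= \liminf_{i \to \infty} \Big( 2^p \int_A |f_i|^p \, d\mu + 2^p \int_A |f|^p \, d\mu - \int_A |f_i - f|^p \, d\mu \Big) \\
&= \lim_{i \to \infty} 2^p \int_A |f_i|^p \, d\mu + 2^p \int_A |f|^p \, d\mu - \limsup_{i \to \infty} \int_A |f_i - f|^p \, d\mu \\
&= 2^p \int_A |f|^p \, d\mu + 2^p \int_A |f|^p \, d\mu - \limsup_{i \to \infty} \int_A |f_i - f|^p \, d\mu.
\end{aligned}
\]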

Here we used the facts that if $(a_i)$ is a converging sequence of real numbers and $(b_i)$ is an arbitrary sequence of real numbers, then
\[
\liminf_{i \to \infty} (a_i + b_i) = \lim_{i \to \infty} a_i + \liminf_{i \to \infty} b_i \quad \text{and} \quad \liminf_{i \to \infty} (-b_i) = -\limsup_{i \to \infty} b_i.
\]
Subtracting $2^{p+1} \int_A |f|^p \, d\mu$ from both sides, we have
\[
\limsup_{i \to \infty} \int_A |f_i - f|^p \, d\mu \le 0.
\]
On the other hand, since the integrands are nonnegative,
\[
\limsup_{i \to \infty} \int_A |f_i - f|^p \, d\mu \ge 0.
\]
Thus
\[
\lim_{i \to \infty} \int_A |f_i - f|^p \, d\mu = 0.
\]

1.5 $L^\infty$ space

The definition of the $L^\infty$ space differs substantially from the definition of the $L^p$ space for $1 \le p < \infty$. The main difference is that the definition is based on the almost everywhere concept instead of integration. The class $L^\infty$ consists of bounded measurable functions, with the interpretation that we neglect the behaviour of the functions on a set of measure zero.

Definition 1.33. Let $A \subset \mathbb{R}^n$ be a $\mu$-measurable set and $f : A \to [-\infty, \infty]$ a $\mu$-measurable function. Then $f \in L^\infty(A)$, if there exists $M$, $0 \le M < \infty$, such that
\[
|f(x)| \le M \quad \text{for $\mu$-almost every } x \in A.
\]
Functions in $L^\infty$ are sometimes called essentially bounded functions. If $f \in L^\infty(A)$, then the essential supremum of $f$ is
\[
\operatorname*{ess\,sup}_{x \in A} f(x) = \inf\{M : f(x) \le M \text{ for $\mu$-almost every } x \in A\} = \inf\{M : \mu(\{x \in A : f(x) > M\}) = 0\}
\]
and the essential infimum of $f$ is
\[
\operatorname*{ess\,inf}_{x \in A} f(x) = \sup\{m : f(x) \ge m \text{ for $\mu$-almost every } x \in A\} = \sup\{m : \mu(\{x \in A : f(x) < m\}) = 0\}.
\]
The $L^\infty$ norm of $f$ is
\[
\|f\|_\infty = \operatorname*{ess\,sup}_{x \in A} |f(x)|.
\]
It is clear that $f \in L^\infty(A)$ if and only if $\|f\|_\infty < \infty$.
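The definition of the essential supremum can be illustrated on a finite measure space, where a set has measure zero exactly when all of its points carry zero weight. The following is a minimal sketch under that assumption; the function name `ess_sup` and the sample data are hypothetical and only serve the illustration.

```python
def ess_sup(values, weights):
    """Essential supremum of |f| on a finite measure space.

    values[k] is f at the k-th point and weights[k] >= 0 is the measure of
    that point. Points of measure zero are ignored.
    """
    charged = [abs(v) for v, w in zip(values, weights) if w > 0]
    # Convention for this sketch: if no point has positive measure, return 0.0,
    # in line with the requirement 0 <= M in Definition 1.33.
    return max(charged, default=0.0)


values = [1.0, -3.0, 7.0]
weights = [0.5, 2.0, 0.0]   # the point where f = 7 carries no mass

print(ess_sup(values, weights))        # essential supremum of |f|
print(max(abs(v) for v in values))     # ordinary supremum of |f|
```

Here the ordinary supremum sees the value 7, while the essential supremum does not, because that value is taken only on a set of measure zero.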

THE MORAL: $\|f\|_\infty$ is the supremum outside sets of measure zero. Observe that the standard supremum of a bounded function $f$ is
\[
\sup_{x \in A} |f(x)| = \inf\{M : \{x \in A : |f(x)| > M\} = \emptyset\}.
\]

WARNING: The $L^p$ norm for $1 \le p < \infty$ depends on the average size of the function, but the $L^\infty$ norm depends on the pointwise values of the function outside a set of measure zero. More precisely, the $L^p$ norm for $1 \le p < \infty$ depends very much on the underlying measure $\mu$ and is sensitive to any changes in $\mu$. The $L^\infty$ norm depends only on the class of sets of $\mu$-measure zero and not on the distribution of the measure $\mu$ itself.

Remark 1.34. In the special case that $A = \mathbb{N}$ and $\mu$ is the counting measure, the $L^\infty(\mathbb{N})$ space is denoted by $l^\infty$ and
\[
l^\infty = \Big\{ (x_i) : \sup_{i \in \mathbb{N}} |x_i| < \infty \Big\}.
\]
Here $(x_i)$ is a sequence of real (or complex) numbers. Thus $l^\infty$ is the space of bounded sequences.

Example 1.35. Assume that $\mu$ is the Lebesgue measure.

(1) Let $f : \mathbb{R} \to \mathbb{R}$, $f(x) = \chi_{\mathbb{Q}}(x)$. Then $\|f\|_\infty = 0$, but $\sup_{x \in \mathbb{R}} |f(x)| = 1$.

(2) Let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = 1/|x|$. Then $f \notin L^\infty(\mathbb{R}^n)$.

Remarks 1.36:

(1) $\|f\|_\infty \le \sup_{x \in A} |f(x)|$.

(2) Let $f \in L^\infty(A)$. Then for every $\varepsilon > 0$, we have
\[
\mu(\{x \in A : |f(x)| > \|f\|_\infty + \varepsilon\}) = 0 \quad \text{and} \quad \mu(\{x \in A : |f(x)| > \|f\|_\infty - \varepsilon\}) > 0.
\]

(3) If $f \in C(A)$ and $\mu(A) > 0$, then $\|f\|_\infty = \sup_{x \in A} |f(x)|$. (Exercise)

Lemma 1.37. Assume that $f \in L^\infty(A)$. Then

(1) $f(x) \le \operatorname*{ess\,sup}_{x \in A} f(x)$ for $\mu$-almost every $x \in A$ and

(2) $f(x) \ge \operatorname*{ess\,inf}_{x \in A} f(x)$ for $\mu$-almost every $x \in A$.

THE MORAL: In other words, $|f(x)| \le \|f\|_\infty$ for $\mu$-almost every $x \in A$. This means that if $f \in L^\infty$, there exists a smallest number $M$ such that $|f(x)| \le M$ for $\mu$-almost every $x \in A$. This smallest number is $\|f\|_\infty$.

Proof. (1) For every $i = 1, 2, \dots$ there exists $M_i \ge 0$ such that $M_i < \|f\|_\infty + \frac{1}{i}$ and $|f(x)| \le M_i$ for $\mu$-almost every $x \in A$.

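Thus there exists $N_i \subset A$ with $\mu(N_i) = 0$ such that $|f(x)| \le M_i$ for every $x \in A \setminus N_i$. Let $N = \bigcup_{i=1}^\infty N_i$. Then
\[
\mu(N) \le \sum_{i=1}^\infty \mu(N_i) = 0.
\]
Observe that
\[
\bigcap_{i=1}^\infty (A \setminus N_i) = A \setminus \bigcup_{i=1}^\infty N_i = A \setminus N.
\]
Then
\[
|f(x)| \le M_i < \|f\|_\infty + \frac{1}{i} \quad \text{for every } x \in A \setminus N, \quad i = 1, 2, \dots
\]
Letting $i \to \infty$, we obtain $|f(x)| \le \|f\|_\infty$ for every $x \in A \setminus N$.

(2) (Exercise)

Lemma 1.38 (Minkowski's inequality for $p = \infty$). If $f, g \in L^\infty(A)$, then $\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty$.

Proof. By Lemma 1.37, $|f(x)| \le \|f\|_\infty$ for $\mu$-almost every $x \in A$ and $|g(x)| \le \|g\|_\infty$ for $\mu$-almost every $x \in A$. Thus
\[
|f(x) + g(x)| \le |f(x)| + |g(x)| \le \|f\|_\infty + \|g\|_\infty
\]
for $\mu$-almost every $x \in A$. By the definition of the $L^\infty$ norm, we have $\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty$.

THE MORAL: This is the triangle inequality for the $L^\infty$ norm. It implies that the $L^\infty$ norm is a norm in the usual sense and that $L^\infty(A)$ is a normed space if the functions that coincide almost everywhere are identified.

Theorem 1.39 (Hölder's inequality for $p = 1$ and $p' = \infty$). If $f \in L^1(A)$ and $g \in L^\infty(A)$, then $fg \in L^1(A)$ and
\[
\|fg\|_1 \le \|g\|_\infty \|f\|_1.
\]

THE MORAL: In practice, we take the essential supremum out of the integral.

Proof. By Lemma 1.37, we have $|g(x)| \le \|g\|_\infty$ for $\mu$-almost every $x \in A$. This implies
\[
|f(x) g(x)| \le \|g\|_\infty |f(x)| \quad \text{for $\mu$-almost every } x \in A
\]
and thus
\[
\int_A |f(x) g(x)| \, d\mu \le \|g\|_\infty \|f\|_1.
\]

Remark 1.40. There is also an $L^p$ version
\[
\|fg\|_p \le \|g\|_\infty \|f\|_p
\]
of the previous theorem.

The next result justifies the notation $\|f\|_\infty$.

Theorem 1.41. If $f \in L^p(A)$ for some $1 \le p < \infty$, then
\[
\lim_{p \to \infty} \|f\|_p = \|f\|_\infty.
\]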
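The statement of Theorem 1.41 can be checked numerically in a simple case. The sketch below is an illustration, not part of the notes: it assumes $A = [0,1]$ with the Lebesgue measure and $f(x) = x$, so that $\|f\|_p = (p+1)^{-1/p}$ and $\|f\|_\infty = 1$. The helper `lp_norm` and its midpoint rule are hypothetical choices made for this example.

```python
def lp_norm(f, p, n=20_000):
    """Midpoint-rule approximation of the L^p norm of f on [0, 1]."""
    h = 1.0 / n
    total = sum(abs(f((k + 0.5) * h)) ** p for k in range(n)) * h
    return total ** (1.0 / p)


f = lambda x: x   # essential supremum on [0, 1] equals 1

for p in (1, 2, 10, 100, 1000):
    exact = (1.0 / (p + 1)) ** (1.0 / p)   # closed form of the L^p norm of x on [0, 1]
    print(p, lp_norm(f, p), exact)
```

The printed norms increase with $p$ and approach 1, in agreement with the theorem.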

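THE MORAL: In this sense, $L^\infty(A)$ is the limit of $L^p(A)$ spaces as $p \to \infty$. Moreover, this gives a useful method to show that $f \in L^\infty$: it is enough to find a uniform bound for the $L^p$ norms as $p \to \infty$.

Proof. Denote $A_\lambda = \{x \in A : |f(x)| > \lambda\}$, $\lambda \ge 0$. Suppose $0 \le \lambda < \|f\|_\infty$. By the definition of the $L^\infty$ norm, we have $\mu(A_\lambda) > 0$. By Chebyshev's inequality
\[
\mu(A_\lambda) \le \int_A \Big( \frac{|f|}{\lambda} \Big)^p \, d\mu = \frac{1}{\lambda^p} \int_A |f|^p \, d\mu < \infty
\]
and thus $\|f\|_p \ge \lambda \, \mu(A_\lambda)^{1/p}$. Since $0 < \mu(A_\lambda) < \infty$, we have $\mu(A_\lambda)^{1/p} \to 1$ as $p \to \infty$. This implies
\[
\liminf_{p \to \infty} \|f\|_p \ge \lambda \quad \text{whenever } 0 \le \lambda < \|f\|_\infty.
\]
By letting $\lambda \to \|f\|_\infty$, we have
\[
\liminf_{p \to \infty} \|f\|_p \ge \|f\|_\infty.
\]
On the other hand, for $q < p < \infty$, we have
\[
\|f\|_p = \Big( \int_A |f|^p \, d\mu \Big)^{1/p} = \Big( \int_A |f|^q |f|^{p-q} \, d\mu \Big)^{1/p} \le \|f\|_\infty^{1 - q/p} \|f\|_q^{q/p}.
\]
Since $\|f\|_q < \infty$ for some $q$, this implies
\[
\limsup_{p \to \infty} \|f\|_p \le \|f\|_\infty.
\]
We have shown that
\[
\limsup_{p \to \infty} \|f\|_p \le \|f\|_\infty \le \liminf_{p \to \infty} \|f\|_p,
\]
which implies that the limit exists and
\[
\lim_{p \to \infty} \|f\|_p = \|f\|_\infty.
\]

Remarks 1.42:

(1) The assumption $f \in L^p(A)$ for some $1 \le p < \infty$ can be replaced with the assumption $\mu(A) < \infty$.

(2) Recall that by Jensen's inequality, the integral average
\[
\Big( \frac{1}{\mu(A)} \int_A |f|^p \, d\mu \Big)^{1/p}
\]
is an increasing function of $p$.

(3) If $0 < \mu(A) < \infty$, then for every $\mu$-measurable function
\[
\lim_{p \to \infty} \Big( \frac{1}{\mu(A)} \int_A |f|^p \, d\mu \Big)^{1/p} = \operatorname*{ess\,sup}_A |f|,
\]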

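\[
\lim_{p \to -\infty} \Big( \frac{1}{\mu(A)} \int_A |f|^p \, d\mu \Big)^{1/p} = \operatorname*{ess\,inf}_A |f|
\]
and
\[
\lim_{p \to 0} \Big( \frac{1}{\mu(A)} \int_A |f|^p \, d\mu \Big)^{1/p} = \exp\Big( \frac{1}{\mu(A)} \int_A \log |f| \, d\mu \Big).
\]

Theorem 1.43. $L^\infty(A)$ is a Banach space.

THE MORAL: The claim and its proof are the same as in showing that the space of continuous functions with the supremum norm is complete. The only difference is that we have to neglect sets of measure zero.

Proof. Let $(f_i)$ be a Cauchy sequence in $L^\infty(A)$. By Lemma 1.37, we have
\[
|f_i(x) - f_j(x)| \le \|f_i - f_j\|_\infty \quad \text{for $\mu$-almost every } x \in A.
\]
Thus there exists $N_{i,j} \subset A$, $\mu(N_{i,j}) = 0$, such that
\[
|f_i(x) - f_j(x)| \le \|f_i - f_j\|_\infty \quad \text{for every } x \in A \setminus N_{i,j}.
\]
Since $(f_i)$ is a Cauchy sequence in $L^\infty(A)$, for every $k = 1, 2, \dots$ there exists $i_k$ such that
\[
\|f_i - f_j\|_\infty < \frac{1}{k} \quad \text{when } i, j \ge i_k.
\]
This implies
\[
|f_i(x) - f_j(x)| < \frac{1}{k} \quad \text{for every } x \in A \setminus N_{i,j}, \quad i, j \ge i_k.
\]
Let $N = \bigcup_{i=1}^\infty \bigcup_{j=1}^\infty N_{i,j}$. Then
\[
\mu(N) \le \sum_{i=1}^\infty \sum_{j=1}^\infty \mu(N_{i,j}) = 0
\]
and
\[
|f_i(x) - f_j(x)| < \frac{1}{k} \quad \text{for every } x \in A \setminus N, \quad i, j \ge i_k.
\]
Thus $(f_i(x))$ is a Cauchy sequence for every $x \in A \setminus N$. Since $\mathbb{R}$ is complete, there exists
\[
\lim_{i \to \infty} f_i(x) = f(x) \quad \text{for every } x \in A \setminus N.
\]
We set $f(x) = 0$, when $x \in N$. Then $f$ is measurable as a pointwise limit of measurable functions. Letting $j \to \infty$ in the preceding inequality gives
\[
|f_i(x) - f(x)| \le \frac{1}{k} \quad \text{for every } x \in A \setminus N, \quad i \ge i_k,
\]
which implies
\[
\|f_i - f\|_\infty \le \frac{1}{k} \quad \text{when } i \ge i_k.
\]
Since $\|f\|_\infty \le \|f - f_i\|_\infty + \|f_i\|_\infty < \infty$, we have $f \in L^\infty(A)$ and $f_i \to f$ in $L^\infty(A)$ as $i \to \infty$.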

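Remark 1.44. The proof shows that $f_i \to f$ in $L^\infty(A)$ as $i \to \infty$ implies that $f_i \to f$ uniformly in $A \setminus N$ with $\mu(N) = 0$.

Example 1.45. Assume that $\mu$ is the Lebesgue measure. Let $f_i : \mathbb{R} \to \mathbb{R}$,
\[
f_i(x) =
\begin{cases}
0, & x \in (-\infty, 0), \\
ix, & x \in [0, \tfrac{1}{i}], \\
1, & x \in (\tfrac{1}{i}, \infty),
\end{cases}
\]
for $i = 1, 2, \dots$ and let $f = \chi_{(0, \infty)}$. Then $f_i(x) \to f(x)$ for every $x \in \mathbb{R}$ as $i \to \infty$, $\|f_i\|_\infty = 1$ for every $i = 1, 2, \dots$ and $\|f\|_\infty = 1$, so that $\lim_{i \to \infty} \|f_i\|_\infty = \|f\|_\infty$, but $\|f_i - f\|_\infty = 1$ for every $i = 1, 2, \dots$ Thus $\lim_{i \to \infty} \|f_i - f\|_\infty \ne 0$. This shows that the claim of Theorem 1.32 does not hold when $p = \infty$.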
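Example 1.45 can be explored numerically. The sketch below is only an illustration under stated assumptions: the names `f_i`, `f` and `sup_difference` are hypothetical, and the essential supremum of $|f_i - f|$ is approximated by sampling the interval $(0, 1/i]$, which is where the two functions differ on a set of positive measure. The value of the limit function at the single point $0$ does not affect an essential supremum.

```python
def f_i(i, x):
    """The piecewise linear function of Example 1.45."""
    if x < 0:
        return 0.0
    if x <= 1.0 / i:
        return i * x
    return 1.0


def f(x):
    """Indicator of the positive half-line."""
    return 1.0 if x > 0 else 0.0


def sup_difference(i, n=10_000):
    """Largest sampled value of |f_i - f| on a uniform grid in (0, 1/i]."""
    xs = [(k + 1) / (n * i) for k in range(n)]
    return max(abs(f_i(i, x) - f(x)) for x in xs)


for i in (1, 10, 100):
    print(i, sup_difference(i))   # stays close to 1 for every i
print(f_i(1000, 0.01), f(0.01))   # pointwise values agree once 1/i < x
```

The sampled supremum does not decrease as $i$ grows, although the values at any fixed point eventually agree.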

2 The Hardy-Littlewood maximal function

The Hardy-Littlewood maximal function is a very useful tool in analysis. The maximal function theorem asserts that the maximal operator is bounded from $L^p$ to $L^p$ for $p > 1$ and for $p = 1$ there is a weak type estimate. The weak type estimate is used to prove the Lebesgue differentiation theorem, which gives a pointwise meaning for a locally integrable function. The Lebesgue differentiation theorem is a higher dimensional version of the fundamental theorem of calculus. It is applied to the study of the density points of a measurable set. As an application we prove a Sobolev embedding theorem.

In this section we restrict our attention to the Lebesgue measure on $\mathbb{R}^n$. We prove Lebesgue's theorem on differentiation of integrals, which is an extension of the one-dimensional fundamental theorem of calculus to the $n$-dimensional case. This theorem states that, for a (locally) integrable function $f : \mathbb{R}^n \to [-\infty, \infty]$, we have
\[
\lim_{r \to 0} \frac{1}{|B(x,r)|} \int_{B(x,r)} f(y) \, dy = f(x)
\]
for almost every $x \in \mathbb{R}^n$. Recall that $B(x,r) = \{y \in \mathbb{R}^n : |y - x| < r\}$ is the open ball with the center $x$ and radius $r > 0$. In proving this result we need to investigate very carefully the behaviour of the integral averages above. This leads to the Hardy-Littlewood maximal function, where we take the supremum of the integral averages instead of the limit. The passage from the limiting expression to a corresponding maximal function is a situation that occurs often. Hardy and Littlewood wrote that they were led to study the one-dimensional version of the maximal function by the question of how a score in cricket can be maximized: "The problem is most easily grasped when stated in the language of cricket, or any other game in which the player compiles a series of scores of which an average is recorded." As we shall see, these concepts and methods have a universal significance in analysis.

2.1 Local $L^p$ spaces

If we are interested in pointwise properties of functions, it is not necessary to require integrability conditions over the whole underlying domain. For example, in the Lebesgue differentiation theorem above, the limit is taken over integral

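averages over balls that shrink to the point $x$, so that the behaviour of $f$ far away from $x$ is irrelevant.

Definition 2.1. Let $\Omega \subset \mathbb{R}^n$ be an open set and assume that $f : \Omega \to [-\infty, \infty]$ is a measurable function. Then $f \in L^p_{\mathrm{loc}}(\Omega)$, if
\[
\int_K |f|^p \, dx < \infty, \quad 1 \le p < \infty, \quad \text{and} \quad \operatorname*{ess\,sup}_K |f| < \infty, \quad p = \infty,
\]
for every compact set $K \subset \Omega$.

Examples 2.2: $L^p(\Omega) \subset L^p_{\mathrm{loc}}(\Omega)$, but the reverse inclusion is not true.

(1) Let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = 1$. Then $f \notin L^p(\mathbb{R}^n)$ for any $1 \le p < \infty$, but $f \in L^p_{\mathrm{loc}}(\mathbb{R}^n)$ for every $1 \le p < \infty$.

(2) Let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = |x|^{-1/2}$. Then $f \notin L^1(\mathbb{R}^n)$, but $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$.

(3) Let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = e^{|x|}$. Then $f \notin L^1(\mathbb{R}^n)$, but $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$.

(4) Let $f : B(0,1) \setminus \{0\} \to \mathbb{R}$, $f(x) = |x|^{-n/p}$. Then $f \notin L^p(B(0,1) \setminus \{0\})$, but $f \in L^p_{\mathrm{loc}}(B(0,1) \setminus \{0\})$, for $1 \le p < \infty$. Moreover, $f \notin L^\infty(B(0,1) \setminus \{0\})$, but $f \in L^\infty_{\mathrm{loc}}(B(0,1) \setminus \{0\})$.

(5) For $p = \infty$, let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = |x|$. Then $f \notin L^\infty(\mathbb{R}^n)$, but $f \in L^\infty_{\mathrm{loc}}(\mathbb{R}^n)$.

Remarks 2.3:

(1) If $1 \le p \le q \le \infty$, then $L^\infty_{\mathrm{loc}}(\Omega) \subset L^q_{\mathrm{loc}}(\Omega) \subset L^p_{\mathrm{loc}}(\Omega) \subset L^1_{\mathrm{loc}}(\Omega)$.

Reason. By Jensen's inequality
\[
\frac{1}{|K|} \int_K |f| \, dx \le \Big( \frac{1}{|K|} \int_K |f|^p \, dx \Big)^{1/p} \le \Big( \frac{1}{|K|} \int_K |f|^q \, dx \Big)^{1/q} \le \operatorname*{ess\,sup}_K |f|,
\]
where $K$ is a compact subset of $\Omega$ with $|K| > 0$.

(2) $C(\Omega) \subset L^p_{\mathrm{loc}}(\Omega)$ for every $1 \le p \le \infty$.

Reason. Since $|f|$, with $f \in C(\Omega)$, assumes its maximum in the compact set $K$ and $K$ has a finite Lebesgue measure, we have
\[
\int_K |f|^p \, dx \le |K| \big( \operatorname*{ess\,sup}_K |f| \big)^p \le |K| \big( \max_K |f| \big)^p < \infty.
\]

(3) $f \in L^p_{\mathrm{loc}}(\mathbb{R}^n)$ $\iff$ $f \in L^p(B(0,r))$ for every $0 < r < \infty$ $\iff$ $f \in L^p(A)$ for every bounded measurable set $A \subset \mathbb{R}^n$.

(4) In general, the quantity
\[
\sup_{K \subset \mathbb{R}^n} \Big( \int_K |f|^p \, dx \Big)^{1/p}
\]
is not a norm in $L^p_{\mathrm{loc}}(\mathbb{R}^n)$, since it may be infinity for some $f \in L^p_{\mathrm{loc}}(\mathbb{R}^n)$. Consider, for example, constant functions on $\mathbb{R}^n$.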

2.2 Definition of the maximal function

We begin with the definition of the maximal function.

Definition 2.4. The centered Hardy-Littlewood maximal function $Mf : \mathbb{R}^n \to [0, \infty]$ of $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ is defined by
\[
Mf(x) = \sup_{r > 0} \frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)| \, dy,
\]
where $B(x,r) = \{y \in \mathbb{R}^n : |y - x| < r\}$ is the open ball with the radius $r > 0$ and the center $x \in \mathbb{R}^n$.

THE MORAL: $Mf(x)$ gives the maximal integral average of the absolute value of the function on balls centered at $x$.

Remarks 2.5:

(1) It is enough to assume that $f : \mathbb{R}^n \to [-\infty, \infty]$ is a measurable function in the definition of the Hardy-Littlewood maximal function. The assumption $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ guarantees that the integral averages are finite.

(2) $Mf$ is defined at every point $x \in \mathbb{R}^n$. If $f = g$ almost everywhere in $\mathbb{R}^n$, then $Mf(x) = Mg(x)$ for every $x \in \mathbb{R}^n$.

(3) It may happen that $Mf(x) = \infty$ for every $x \in \mathbb{R}^n$. For example, let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = |x|$. Then $Mf(x) = \infty$ for every $x \in \mathbb{R}^n$.

(4) There are several seemingly different definitions, which are comparable. Let
\[
\widetilde{M} f(x) = \sup_{B \ni x} \frac{1}{|B|} \int_B |f(y)| \, dy
\]
be the noncentered maximal function, where the supremum is taken over all open balls $B$ containing the point $x \in \mathbb{R}^n$. Then $Mf(x) \le \widetilde{M} f(x)$ for every $x \in \mathbb{R}^n$. On the other hand, if $B = B(z,r) \ni x$, then $B(z,r) \subset B(x,2r)$ and
\[
\frac{1}{|B|} \int_B |f(y)| \, dy \le \frac{|B(x,2r)|}{|B(z,r)|} \, \frac{1}{|B(x,2r)|} \int_{B(x,2r)} |f(y)| \, dy = 2^n \frac{1}{|B(x,2r)|} \int_{B(x,2r)} |f(y)| \, dy \le 2^n Mf(x).
\]
This implies that $\widetilde{M} f(x) \le 2^n Mf(x)$ and thus
\[
Mf(x) \le \widetilde{M} f(x) \le 2^n Mf(x)
\]
for every $x \in \mathbb{R}^n$.

(5) It is possible to use cubes in the definition of the maximal function and this will give a comparable notion as well.
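The comparison $Mf \le \widetilde{M} f \le 2^n Mf$ of Remark 2.5 (4) can be observed in one dimension. The following is a rough sketch, assuming $n = 1$ and $f = \chi_{[0,1]}$, for which the integrals over intervals are lengths of intersections. The function names and the finite grids of radii and centres are hypothetical choices, so the suprema are only approximated from below.

```python
def overlap(lo, hi, a, b):
    """Length of the intersection of the interval (lo, hi) with [a, b]."""
    return max(0.0, min(hi, b) - max(lo, a))


def centered_max(x, a, b, radii):
    """Approximate M f(x) for f = indicator of [a, b], searching over the given radii."""
    return max(overlap(x - r, x + r, a, b) / (2 * r) for r in radii)


def uncentered_max(x, a, b, radii, steps=100):
    """Approximate the noncentered maximal function over intervals (z - r, z + r) containing x."""
    best = 0.0
    for r in radii:
        for k in range(1, steps):
            z = x - r + 2 * r * k / steps   # centres z with |x - z| < r
            best = max(best, overlap(z - r, z + r, a, b) / (2 * r))
    return best


radii = [0.01 * k for k in range(1, 501)]   # radii 0.01, 0.02, ..., 5.00
a, b = 0.0, 1.0
for x in (-2.0, 0.5, 3.0):
    m = centered_max(x, a, b, radii)
    mt = uncentered_max(x, a, b, radii)
    print(x, round(m, 3), round(mt, 3))
```

At the sample points outside $[0,1]$ the noncentered value is close to twice the centered one, which is the extreme case $2^n = 2$ of the comparison.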

Examples 2.6:

(1) Let $f : \mathbb{R} \to \mathbb{R}$, $f = \chi_{[a,b]}$. Then $Mf(x) = 1$, if $x \in (a,b)$. For $x \ge b$ a calculation shows that the maximal average is obtained when $r = x - a$. Similarly, when $x \le a$, the maximal average is obtained when $r = b - x$. Thus
\[
Mf(x) =
\begin{cases}
\dfrac{b - a}{2 |x - b|}, & x \le a, \\
1, & x \in (a,b), \\
\dfrac{b - a}{2 |x - a|}, & x \ge b.
\end{cases}
\]
Note that the centered maximal function $Mf$ has jump discontinuities at $x = a$ and $x = b$.

THE MORAL: $f \in L^1(\mathbb{R})$ does not imply $Mf \in L^1(\mathbb{R})$.

(2) Consider the noncentered maximal function $\widetilde{M} f$ of $f : \mathbb{R} \to \mathbb{R}$, $f = \chi_{[a,b]}$. Again $\widetilde{M} f(x) = 1$, if $x \in (a,b)$. For $x > b$ a calculation shows that the maximal average over all intervals $(z - r, z + r)$ is obtained when $z = (x + a)/2$ and $r = (x - a)/2$. Similarly, when $x < a$, the maximal average is obtained when $z = (b + x)/2$ and $r = (b - x)/2$. Thus
\[
\widetilde{M} f(x) =
\begin{cases}
\dfrac{b - a}{|x - b|}, & x \le a, \\
1, & x \in (a,b), \\
\dfrac{b - a}{|x - a|}, & x \ge b.
\end{cases}
\]
Note that the noncentered maximal function $\widetilde{M} f$ does not have discontinuities at $x = a$ and $x = b$.

Lemma 2.7. If $f \in C(\mathbb{R}^n)$, then $|f(x)| \le Mf(x)$ for every $x \in \mathbb{R}^n$.

THE MORAL: This justifies the terminology, since the maximal function is pointwise greater than or equal to the absolute value of the original function.

Proof. Assume that $f \in C(\mathbb{R}^n)$ and let $x \in \mathbb{R}^n$. Then for every $\varepsilon > 0$ there exists $\delta > 0$ such that $|f(x) - f(y)| < \varepsilon$ if $|x - y| < \delta$. This implies
\[
\begin{aligned}
\Big| \frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)| \, dy - |f(x)| \Big|
&= \Big| \frac{1}{|B(x,r)|} \int_{B(x,r)} \big( |f(y)| - |f(x)| \big) \, dy \Big| \\
&\le \frac{1}{|B(x,r)|} \int_{B(x,r)} \big| |f(y)| - |f(x)| \big| \, dy \\
&\le \frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y) - f(x)| \, dy \le \varepsilon, \quad \text{if } r \le \delta.
\end{aligned}
\]
Thus
\[
|f(x)| = \lim_{r \to 0} \frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)| \, dy \le Mf(x) \quad \text{for every } x \in \mathbb{R}^n.
\]

The next thing we would like to show is that $Mf : \mathbb{R}^n \to [0, \infty]$ is a measurable function. Recall that a function $f : \mathbb{R}^n \to [-\infty, \infty]$ is lower semicontinuous, if

the distribution set $\{x \in \mathbb{R}^n : f(x) > \lambda\}$ is open for every $\lambda \in \mathbb{R}$. Since open sets are Lebesgue measurable, it follows that every lower semicontinuous function is Lebesgue measurable.

Lemma 2.8. $Mf$ is lower semicontinuous.

Proof. Let $A_\lambda = \{x \in \mathbb{R}^n : Mf(x) > \lambda\}$, $\lambda > 0$. For every $x \in A_\lambda$ there exists $r > 0$ such that
\[
\frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)| \, dy > \lambda.
\]
By the properties of the integral
\[
\frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)| \, dy = \lim_{\substack{r' \to r \\ r' > r}} \frac{1}{|B(x,r')|} \int_{B(x,r)} |f(y)| \, dy,
\]
which implies that there exists $r' > r$ such that
\[
\frac{1}{|B(x,r')|} \int_{B(x,r)} |f(y)| \, dy > \lambda.
\]
If $|x - x'| < r' - r$, then $B(x,r) \subset B(x',r')$, since
\[
|y - x'| \le |y - x| + |x - x'| < r + (r' - r) = r'
\]
for every $y \in B(x,r)$. Thus
\[
\lambda < \frac{1}{|B(x,r')|} \int_{B(x,r)} |f(y)| \, dy = \frac{1}{|B(x',r')|} \int_{B(x,r)} |f(y)| \, dy \le \frac{1}{|B(x',r')|} \int_{B(x',r')} |f(y)| \, dy \le Mf(x'),
\]
if $|x' - x| < r' - r$. This shows that $B(x, r' - r) \subset A_\lambda$ and thus $A_\lambda$ is an open set.

2.3 Hardy-Littlewood-Wiener maximal function theorems

Another point of view is to consider the Hardy-Littlewood maximal operator $f \mapsto Mf$. We shall list some properties of this operator below.

Lemma 2.9. Assume that $f, g \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$.

(1) (Positivity) $Mf(x) \ge 0$ for every $x \in \mathbb{R}^n$.

(2) (Sublinearity) $M(f + g)(x) \le Mf(x) + Mg(x)$.

(3) (Homogeneity) $M(af)(x) = |a| \, Mf(x)$, $a \in \mathbb{R}$.

(4) (Translation invariance) $M(\tau_y f)(x) = (\tau_y Mf)(x)$, $y \in \mathbb{R}^n$, where $\tau_y f(x) = f(x + y)$.

Proof. Exercise.

We are interested in the behaviour of the maximal operator in $L^p$ spaces. The following results were first proved by Hardy and Littlewood in the one-dimensional case and extended later by Wiener to the higher dimensional case.

Lemma 2.10. If $f \in L^\infty(\mathbb{R}^n)$, then $Mf \in L^\infty(\mathbb{R}^n)$ and $\|Mf\|_\infty \le \|f\|_\infty$.

THE MORAL: The maximal function is essentially bounded, and thus finite almost everywhere, if the original function is essentially bounded. Intuitively this is clear, since the integral averages cannot be bigger than the essential supremum of the function.

Proof. For every $x \in \mathbb{R}^n$ and $r > 0$ we have
\[
\frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)| \, dy \le \frac{1}{|B(x,r)|} \, \|f\|_\infty \, |B(x,r)| = \|f\|_\infty.
\]
Thus $Mf(x) \le \|f\|_\infty$ for every $x \in \mathbb{R}^n$ and $\|Mf\|_\infty \le \|f\|_\infty$.

Another way to state the previous lemma is that $M : L^\infty(\mathbb{R}^n) \to L^\infty(\mathbb{R}^n)$ is a bounded operator. As we have seen before, $f \in L^1(\mathbb{R})$ does not imply that $Mf \in L^1(\mathbb{R})$ and thus the Hardy-Littlewood maximal operator is not bounded in $L^1(\mathbb{R}^n)$. We give another example of this phenomenon.

Example 2.11. Let $r > 0$. Then there are constants $c_1 = c_1(n)$ and $c_2 = c_2(n)$ such that
\[
\frac{c_1 r^n}{(|x| + r)^n} \le M(\chi_{B(0,r)})(x) \le \frac{c_2 r^n}{(|x| + r)^n}
\]
for every $x \in \mathbb{R}^n$ (exercise). Since these functions do not belong to $L^1(\mathbb{R}^n)$, we see that the Hardy-Littlewood maximal operator does not map $L^1(\mathbb{R}^n)$ to $L^1(\mathbb{R}^n)$.

Next we show an even stronger result: $Mf \notin L^1(\mathbb{R}^n)$ for every nontrivial $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$.

Remark 2.12. $Mf \in L^1(\mathbb{R}^n)$ implies $f = 0$.

Reason. Let $r > 0$ and let $x \in \mathbb{R}^n$ such that $|x| \ge r$. Then
\[
\begin{aligned}
Mf(x) &\ge \frac{1}{|B(x, 2|x|)|} \int_{B(x, 2|x|)} |f(y)| \, dy \\
&\ge \frac{1}{|B(0, 2|x|)|} \int_{B(0,r)} |f(y)| \, dy \quad \big( B(0,r) \subset B(x, 2|x|), \text{ since } |y| < r \implies |y - x| \le |y| + |x| < r + |x| \le 2|x| \big) \\
&= \frac{c}{|x|^n} \int_{B(0,r)} |f(y)| \, dy.
\end{aligned}
\]
If $f \ne 0$, we choose $r > 0$ large enough that
\[
\int_{B(0,r)} |f(y)| \, dy > 0.
\]
Then $Mf(x) \ge c/|x|^n$ for every $x \in \mathbb{R}^n \setminus B(0,r)$. Since $c/|x|^n \notin L^1(\mathbb{R}^n \setminus B(0,r))$, we conclude that $Mf \notin L^1(\mathbb{R}^n)$. This is a contradiction and thus $f = 0$ almost everywhere.

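The remark above shows that the maximal function is essentially never in $L^1$, but the essential issue for this is what happens far away from the origin. The next example shows that the maximal function does not need to be even locally in $L^1$.

Example 2.13. Let $f : \mathbb{R} \to \mathbb{R}$,
\[
f(x) = \frac{\chi_{(0,1/2)}(x)}{x (\log x)^2}.
\]
Then $f \in L^1(\mathbb{R})$, since
\[
\int_{\mathbb{R}} |f(x)| \, dx = \int_0^{1/2} \frac{1}{x (\log x)^2} \, dx = \Big[ -\frac{1}{\log x} \Big]_0^{1/2} < \infty.
\]
For $0 < x < 1/2$, we have
\[
Mf(x) \ge \frac{1}{2x} \int_0^{2x} |f(y)| \, dy \ge \frac{1}{2x} \int_0^{x} |f(y)| \, dy = \frac{1}{2x} \int_0^x \frac{1}{y (\log y)^2} \, dy = \frac{1}{2x} \Big[ -\frac{1}{\log y} \Big]_0^x = -\frac{1}{2x \log x} \notin L^1((0, 1/2)).
\]
Thus $Mf \notin L^1_{\mathrm{loc}}(\mathbb{R})$.

After these considerations, the situation for $L^1$ boundedness looks rather hopeless. However, there is a substitute result, which says that if $f \in L^1$, then $Mf$ belongs to a weakened version of $L^1$.

Definition 2.14. A measurable function $f : \mathbb{R}^n \to [-\infty, \infty]$ belongs to weak $L^1(\mathbb{R}^n)$, if there exists a constant $c$, $0 \le c < \infty$, such that
\[
|\{x \in \mathbb{R}^n : |f(x)| > \lambda\}| \le \frac{c}{\lambda} \quad \text{for every } \lambda > 0.
\]

Remarks 2.15:

(1) $L^1(\mathbb{R}^n) \subset$ weak $L^1(\mathbb{R}^n)$.

Reason. By Chebyshev's inequality
\[
|\{x \in \mathbb{R}^n : |f(x)| > \lambda\}| \le \frac{1}{\lambda} \int_{\{x \in \mathbb{R}^n : |f(x)| > \lambda\}} |f(y)| \, dy \le \frac{1}{\lambda} \|f\|_1 \quad \text{for every } \lambda > 0.
\]

(2) Weak $L^1(\mathbb{R}^n) \not\subset L^1(\mathbb{R}^n)$.

Reason. Let $f : \mathbb{R}^n \to [0, \infty]$, $f(x) = |x|^{-n}$. Then $f \notin L^1(\mathbb{R}^n)$, but
\[
|\{x \in \mathbb{R}^n : |f(x)| > \lambda\}| = |B(0, \lambda^{-1/n})| = \Omega_n (\lambda^{-1/n})^n = \frac{\Omega_n}{\lambda} \quad \text{for every } \lambda > 0.
\]
Here $\Omega_n = |B(0,1)|$. Thus $f$ belongs to weak $L^1(\mathbb{R}^n)$.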
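The function in Remark 2.15 (2) can be examined in the case $n = 1$, where $f(x) = 1/|x|$ and $\Omega_1 = 2$. The sketch below uses closed-form expressions rather than numerical integration; the function names are hypothetical. It evaluates $\lambda \, |\{f > \lambda\}|$, which stays equal to $2$, and the integral of $f$ over $\varepsilon < |x| < R$, which grows without bound as $\varepsilon \to 0$ and $R \to \infty$.

```python
import math


def distribution(lam):
    """Lebesgue measure of {x in R : 1/|x| > lam}, that is of 0 < |x| < 1/lam."""
    return 2.0 / lam


def truncated_integral(eps, big_r):
    """Exact integral of 1/|x| over eps < |x| < big_r, equal to 2 log(big_r / eps)."""
    return 2.0 * math.log(big_r / eps)


for lam in (0.1, 1.0, 10.0):
    print(lam, lam * distribution(lam))   # the weak L^1 constant, about 2 for every lam

for k in (1, 2, 4, 8):
    print(k, truncated_integral(10.0 ** -k, 10.0 ** k))   # increases with k
```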

The next goal is to show that the Hardy-Littlewood maximal operator maps $L^1$ to weak $L^1$. The proof is based on the extremely useful covering theorem.

Theorem 2.16 (Vitali covering theorem). Let $\mathcal{F}$ be a collection of open balls $B$ such that
\[
\operatorname{diam}\Big( \bigcup_{B \in \mathcal{F}} B \Big) < \infty.
\]
Then there is a countable (or finite) subcollection of pairwise disjoint balls $B(x_i, r_i) \in \mathcal{F}$, $i = 1, 2, \dots$, such that
\[
\bigcup_{B \in \mathcal{F}} B \subset \bigcup_{i=1}^\infty B(x_i, 5 r_i).
\]

THE MORAL: Let $A$ be a bounded subset of $\mathbb{R}^n$ and suppose that for every $x \in A$ there is a ball $B(x, r_x)$ with the radius $r_x > 0$ possibly depending on the point $x$. We would like to have a countable subcollection of pairwise disjoint balls $B(x_i, r_i)$, $i = 1, 2, \dots$, which covers the union of the original balls. In general, this is not possible, if we do not expand the balls. Thus
\[
|A| \le \Big| \bigcup_{x \in A} B(x, r_x) \Big| \le \Big| \bigcup_{i=1}^\infty B(x_i, 5 r_i) \Big| \le \sum_{i=1}^\infty |B(x_i, 5 r_i)| = 5^n \sum_{i=1}^\infty |B(x_i, r_i)| = 5^n \Big| \bigcup_{i=1}^\infty B(x_i, r_i) \Big| \le 5^n \Big| \bigcup_{x \in A} B(x, r_x) \Big|.
\]
Note that the measure of $A$ can be estimated by the measure of the union of the balls and that the measures of $\bigcup_{x \in A} B(x, r_x)$ and $\bigcup_{i=1}^\infty B(x_i, r_i)$ are comparable.

THE STRATEGY OF THE PROOF: The greedy principle: the balls are selected inductively by taking the largest ball with the required properties that has not been chosen earlier.

Proof. Assume that $B(x_1, r_1), \dots, B(x_{i-1}, r_{i-1}) \in \mathcal{F}$ have been selected. Define
\[
d_i = \sup\Big\{ r : B(x,r) \in \mathcal{F} \text{ and } B(x,r) \cap \bigcup_{j=1}^{i-1} B(x_j, r_j) = \emptyset \Big\}.
\]
Observe that $d_i < \infty$, since $\sup_{B(x,r) \in \mathcal{F}} r < \infty$. If there are no balls $B(x,r) \in \mathcal{F}$ such that
\[
B(x,r) \cap \bigcup_{j=1}^{i-1} B(x_j, r_j) = \emptyset,
\]
the process terminates and we have selected the balls $B(x_1, r_1), \dots, B(x_{i-1}, r_{i-1})$. Otherwise, we choose $B(x_i, r_i) \in \mathcal{F}$ such that
\[
r_i > \frac{1}{2} d_i \quad \text{and} \quad B(x_i, r_i) \cap \bigcup_{j=1}^{i-1} B(x_j, r_j) = \emptyset.
\]
We can also choose the first ball $B(x_1, r_1)$ in this way.

The selected balls are pairwise disjoint. Let $B \in \mathcal{F}$ be an arbitrary ball in the collection $\mathcal{F}$. Then $B = B(x,r)$ intersects at least one of the selected balls $B(x_1, r_1), B(x_2, r_2), \dots$, since otherwise $B(x,r) \cap B(x_i, r_i) = \emptyset$ for every $i = 1, 2, \dots$ and, by the definition of $d_i$, we have $d_i \ge r$ for every $i = 1, 2, \dots$ This implies
\[
r_i > \frac{1}{2} d_i \ge \frac{1}{2} r > 0 \quad \text{for every } i = 1, 2, \dots,
\]
and by the fact that the balls are pairwise disjoint, we have
\[
\Big| \bigcup_{i=1}^\infty B(x_i, r_i) \Big| = \sum_{i=1}^\infty |B(x_i, r_i)| = \infty.
\]
This is impossible, since $\bigcup_{i=1}^\infty B(x_i, r_i)$ is bounded and thus $\big| \bigcup_{i=1}^\infty B(x_i, r_i) \big| < \infty$.

Since $B(x,r)$ intersects some ball $B(x_i, r_i)$, $i = 1, 2, \dots$, there is a smallest index $i$ such that $B(x,r) \cap B(x_i, r_i) \ne \emptyset$. This implies
\[
B(x,r) \cap \bigcup_{j=1}^{i-1} B(x_j, r_j) = \emptyset
\]
and by the selection process $r \le d_i < 2 r_i$. Since $B(x,r) \cap B(x_i, r_i) \ne \emptyset$ and $r \le 2 r_i$, we have $B(x,r) \subset B(x_i, 5 r_i)$.

Reason. Let $z \in B(x,r) \cap B(x_i, r_i)$ and $y \in B(x,r)$. Then
\[
|y - x_i| \le |y - z| + |z - x_i| \le 2r + r_i \le 5 r_i.
\]
This completes the proof.

Remarks 2.17:

(1) The factor 5 in the theorem is not optimal. In fact, the same proof shows that this factor can be replaced with 3. A slight modification of the argument shows that $2 + \varepsilon$ for any $\varepsilon > 0$ will do. To obtain this, choose $B(x_i, r_i) \in \mathcal{F}$ such that
\[
r_i > \frac{1}{1 + \varepsilon} d_i
\]
in the proof above. This is the optimal result, since 2 does not work in general (exercise).

(2) A similar covering theorem holds true for cubes as well.

(3) Some kind of boundedness assumption is needed in the Vitali covering theorem.

Reason. Consider the balls $B(0,i)$, $i = 1, 2, \dots$ Since all balls intersect each other, the only subfamily of pairwise disjoint balls consists of one single ball $B(0,i)$ and the enlarged ball $B(0, 5i)$ does not cover $\bigcup_{i=1}^\infty B(0,i) = \mathbb{R}^n$.

Theorem 2.18 (Hardy-Littlewood I). Let $f \in L^1(\mathbb{R}^n)$. Then
\[
|\{x \in \mathbb{R}^n : Mf(x) > \lambda\}| \le \frac{5^n}{\lambda} \|f\|_1 \quad \text{for every } \lambda > 0.
\]

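THE MORAL: The Hardy-Littlewood maximal operator maps $L^1$ to weak $L^1$. It is said that the Hardy-Littlewood maximal operator is of weak type $(1,1)$.

Proof. Let $A_\lambda = \{x \in \mathbb{R}^n : Mf(x) > \lambda\}$, $\lambda > 0$. For every $x \in A_\lambda$ there exists $r_x > 0$ such that
\[
\frac{1}{|B(x, r_x)|} \int_{B(x, r_x)} |f(y)| \, dy > \lambda. \tag{2.1}
\]
We would like to apply the Vitali covering theorem, but the set $\bigcup_{x \in A_\lambda} B(x, r_x)$ is not necessarily bounded. To overcome this problem, we consider the sets $A_\lambda \cap B(0,k)$, $k = 1, 2, \dots$ Let $\mathcal{F}$ be the collection of balls $B(x, r_x)$ for which (2.1) holds and $x \in A_\lambda \cap B(0,k)$. If $B(x, r_x) \in \mathcal{F}$, then
\[
\Omega_n r_x^n = |B(x, r_x)| < \frac{1}{\lambda} \int_{B(x, r_x)} |f(y)| \, dy \le \frac{1}{\lambda} \|f\|_1,
\]
so that
\[
\operatorname{diam}\Big( \bigcup_{x \in A_\lambda \cap B(0,k)} B(x, r_x) \Big) < \infty.
\]
By the Vitali covering theorem, we obtain pairwise disjoint balls $B(x_i, r_i)$, $i = 1, 2, \dots$, such that
\[
A_\lambda \cap B(0,k) \subset \bigcup_{i=1}^\infty B(x_i, 5 r_i).
\]
This implies
\[
\begin{aligned}
|A_\lambda \cap B(0,k)| &\le \Big| \bigcup_{i=1}^\infty B(x_i, 5 r_i) \Big| \le \sum_{i=1}^\infty |B(x_i, 5 r_i)| = 5^n \sum_{i=1}^\infty |B(x_i, r_i)| \\
&\le \frac{5^n}{\lambda} \sum_{i=1}^\infty \int_{B(x_i, r_i)} |f(y)| \, dy = \frac{5^n}{\lambda} \int_{\bigcup_{i=1}^\infty B(x_i, r_i)} |f(y)| \, dy \le \frac{5^n}{\lambda} \|f\|_1.
\end{aligned}
\]
Finally,
\[
|A_\lambda| = \lim_{k \to \infty} |A_\lambda \cap B(0,k)| \le \frac{5^n}{\lambda} \|f\|_1.
\]

Remark 2.19. $f \in L^1(\mathbb{R}^n)$ implies $Mf < \infty$ almost everywhere in $\mathbb{R}^n$.

Reason.
\[
|\{x \in \mathbb{R}^n : Mf(x) = \infty\}| \le |\{x \in \mathbb{R}^n : Mf(x) > \lambda\}| \le \frac{5^n}{\lambda} \|f\|_1 \to 0 \quad \text{as } \lambda \to \infty.
\]

The next goal is to show that the Hardy-Littlewood maximal operator maps $L^p$ to $L^p$ if $p > 1$. We recall the following Cavalieri's principle.

Lemma 2.20. Assume that $\mu$ is an outer measure, $A \subset \mathbb{R}^n$ is a $\mu$-measurable set and $f : A \to [-\infty, \infty]$ is a $\mu$-measurable function. Then
\[
\int_A |f|^p \, d\mu = p \int_0^\infty \lambda^{p-1} \mu(\{x \in A : |f(x)| > \lambda\}) \, d\lambda, \quad 0 < p < \infty.
\]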
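Cavalieri's principle in Lemma 2.20 can be verified numerically in a simple case. The sketch below is an illustration under stated assumptions: $A = [0,1]$ with the Lebesgue measure and $f(x) = x$, so that $\mu(\{x \in A : |f(x)| > \lambda\}) = 1 - \lambda$ for $0 \le \lambda \le 1$ and both sides equal $1/(p+1)$. The function name `layer_cake` and the midpoint rule are hypothetical choices for this example.

```python
def layer_cake(p, n=2000):
    """Both sides of Cavalieri's principle for f(x) = x on [0, 1] with Lebesgue measure."""
    h = 1.0 / n
    mids = [(k + 0.5) * h for k in range(n)]
    # Left-hand side: integral of |f|^p over [0, 1].
    lhs = sum(x ** p for x in mids) * h
    # Right-hand side: the distribution function is 1 - lam on [0, 1] and 0 for lam > 1,
    # so the integral over (0, infinity) reduces to an integral over (0, 1).
    rhs = p * sum(lam ** (p - 1) * (1.0 - lam) for lam in mids) * h
    return lhs, rhs


for p in (1, 2, 3.5):
    lhs, rhs = layer_cake(p)
    print(p, lhs, rhs, 1.0 / (p + 1))
```

Both approximations agree with $1/(p+1)$ up to the discretization error of the midpoint rule.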