Integration and Expectation

Chapter 9 Integration and Expectation

After Lebesgue's investigations, the analogy between the measure of a set and the probability of an event, as well as between the integral of a function and the mathematical expectation of a random variable, was clear.
Andrey Kolmogorov

This new integral of Lebesgue's is proving itself a wonderful tool. I might compare it with a modern Krupp gun, so easily does it penetrate barriers which were impregnable.
Edward van Vleck

Does anyone believe that the difference between the Lebesgue and Riemann integrals can have physical significance, and that whether say, an airplane would or would not fly could depend on this difference? If such were claimed, I should not care to fly in that plane.
Richard Hamming

Panorama

So far we have developed a theory for measure on a rich collection of subsets of a given space and a class of functions that interact well with this notion of measure. Now we begin the study of calculus on those functions, focusing at first on the twin concepts of integration of a measurable function and expectation of a random variable.

We recall from elementary calculus that there are two ways to think about integration:
1. A technique for computing the area underneath a curve;
2. A technique for anti-differentiation.

Roughly speaking, the Fundamental Theorem of Calculus says these two apparently different processes are the same. However, the first interpretation of integration is a more natural starting point in view of generalizing from computing area to computing measure. We make the connection to differentiation later in Chapter 12.

Recall that the Riemann integral is defined as the limit of integrals of piecewise constant functions defined on partitions of the domain. We define the new concept of an integral of a measurable function using the limit of approximating sequences of simple functions via Theorem 8.3.7. In particular, we define in order the integrals of:
1. simple functions,
2. nonnegative measurable functions and random variables,
3. general measurable functions and random variables.

Steps 1 and 2 are straightforward in conception. We first prove a number of (mostly familiar) elementary properties of the integral of simple functions. The major part of the work in extending integration to nonnegative measurable functions lies in proving that the extension inherits all those properties. One interesting aspect of the new integral is that it treats the case of infinite integrals in a completely natural way. However, therein lie the complications for Step 3, where some care is needed to deal with infinite integrals. Once this is dealt with, the main work again lies in showing that integration of general measurable functions inherits the properties of integration on nonnegative measurable functions. Finally, expectation is simply integration of random variables, so it inherits a number of properties from integration.

After establishing the basic theory of integration and expectation, we turn to proving the main results that justify the new approach to integration. Several of the major results have to do with computing integrals of sequences or series of measurable functions. Recall that a major issue with the Riemann integral is that it does not work well with limits, e.g. the limit of Riemann integrable functions may not be Riemann integrable. One of the consequences of these results is a change of variables formula whose analog in Riemann integration is very familiar.

Another major result is that if we consider the integral of a nonnegative measurable function over a set as a function of the set, we obtain a measure. The immediate consequence is a number of new properties of integration. We also use this fact to introduce the important concept of a probability density function. We also show that integration satisfies an integral version of the famous Cauchy-Schwarz inequality.
We do not exploit the impact of this result here, but explore it fully below. Finally, we conclude by giving a precise description of the relation between the Riemann integral and the new concept of integration.

We assume that $(\mathbb{X}, \mathcal{M}, \mu)$ is a measure space and $(\Omega, \mathcal{F}, P)$ is a probability space. All measurable functions are extended real-valued measurable functions mapping $(\mathbb{X}, \mathcal{M}, \mu)$ to $(\overline{\mathbb{R}}, \mathcal{B}_{\overline{\mathbb{R}}})$, or some restriction, while all r.v. are extended real-valued measurable functions defined on $(\Omega, \mathcal{F}, P)$.

9.1 Simple functions and random variables

We begin with the definition of integration for simple functions:

Definition 9.1.1 Let $s$ be a simple function of the form
\[ s = \sum_{i=1}^{k} a_i \chi_{A_i}, \tag{9.1} \]
where $\{a_i\}$ are real numbers and $\{A_i\}$ is a sequence of nonintersecting measurable sets such that $\mathbb{X} = \bigcup_i A_i$. The (Lebesgue) integral of $s$ (with respect to $\mu$) is defined,
\[ \int s \, d\mu = \int_{\mathbb{X}} s \, d\mu = \int_{\mathbb{X}} s \, d\mu(x) = \sum_{i=1}^{k} a_i \mu(A_i). \tag{9.2} \]
If $B \in \mathcal{M}$, the (Lebesgue) integral of $s$ over $B$ (with respect to $\mu$) is defined,
\[ \int_B s \, d\mu = \int s \chi_B \, d\mu = \int_B s \, d\mu(x) = \sum_{i=1}^{k} a_i \mu(A_i \cap B). \tag{9.3} \]

The notation $d\mu$ is used to indicate the measure with respect to which the integral is being computed. We use the notation $d\mu(x)$ in situations in which there is a possibility of confusion about which measure is being used, for example, when computing an integral over one dimension of a function of several variables or when treating a composition of functions. In a similar way, we indicate the set over which an integral is computed when there is a possibility of confusion about which set is intended.

Example 9.1.1 Consider the simple functions on $[0,1]$ with the Lebesgue measure,
\[ s_1(x) = \begin{cases} 5, & \tfrac{1}{3} < x \le 1, \\ 3, & 0 \le x \le \tfrac{1}{3}, \end{cases} \qquad s_2(x) = \begin{cases} 2, & x \text{ is rational}, \\ 4, & x = 1/\sqrt{2}, \\ 6, & \text{otherwise}. \end{cases} \]
Then,
\[ \int s_1 \, d\mu = \frac{13}{3}, \qquad \int s_2 \, d\mu = 6. \]

Recall that a simple function does not have a unique representation. We require integration to be well posed in the following sense.

Theorem 9.1.1 The integral of a simple function is independent of the choice of its representation.

This means we can assume the standard representation is used for simple functions.

Proof. Assume that the simple function $s$ has two representations,
\[ s = \sum_{i=1}^{k} a_i \chi_{A_i} = \sum_{j=1}^{l} b_j \chi_{B_j}, \tag{9.4} \]
where $\{A_i\}$ and $\{B_j\}$ are sequences of nonintersecting measurable sets such that $\mathbb{X} = \bigcup_i A_i = \bigcup_j B_j$. We prove that the integrals of the two representations are equal. If $A_i \cap B_j \neq \emptyset$ for some $i, j$, choose $x \in A_i \cap B_j$. Evaluating (9.4), we find that
\[ \sum_{i=1}^{k} a_i \chi_{A_i}(x) = a_i = b_j = \sum_{j=1}^{l} b_j \chi_{B_j}(x). \]

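Next, we note that each $A_i$ can be written as the disjoint union $A_i = \bigcup_{j=1}^{l} (A_i \cap B_j)$. So,
\[ \sum_{i=1}^{k} a_i \mu(A_i) = \sum_{i=1}^{k} a_i \sum_{j=1}^{l} \mu(A_i \cap B_j) = \sum_{i=1}^{k} \sum_{j=1}^{l} a_i \mu(A_i \cap B_j). \]
The terms in the last sum in which $A_i \cap B_j = \emptyset$ are zero. Hence,
\[ \sum_{i=1}^{k} \sum_{j=1}^{l} a_i \mu(A_i \cap B_j) = \sum_{i=1}^{k} \sum_{j=1}^{l} b_j \mu(A_i \cap B_j) = \sum_{j=1}^{l} b_j \sum_{i=1}^{k} \mu(A_i \cap B_j) = \sum_{j=1}^{l} b_j \mu(B_j). \]
This proves the result.

We next prove the basic properties of integration on simple functions.

Theorem 9.1.2 Integration for simple functions satisfies the following:
1. If $A$ is a measurable set and the simple function $s(x) \ge 0$ for all $x \in A$, then $\int_A s \, d\mu \ge 0$.
2. If $s$ is a simple function and $A$ is a measurable set, $\left| \int_A s \, d\mu \right| \le \int_A |s| \, d\mu \le \sup_A |s| \, \mu(A)$.
3. If $A$ is a measurable set and the simple function $s$ satisfies $s = 0$ a.e. on $A$, then $\int_A s \, d\mu = 0$.
4. If $s_1, s_2$ are simple functions, $c_1, c_2 \in \mathbb{R}$, and $A$ is a measurable set, $\int_A (c_1 s_1 + c_2 s_2) \, d\mu = c_1 \int_A s_1 \, d\mu + c_2 \int_A s_2 \, d\mu$.
5. If $A$ is a measurable set and the simple functions $s_1, s_2$ satisfy $s_1 = s_2$ a.e. on $A$, then $\int_A s_1 \, d\mu = \int_A s_2 \, d\mu$.
6. If $A$ is a measurable set and $s_1, s_2$ are simple functions such that $s_1 \le s_2$ a.e. in $A$, $\int_A s_1 \, d\mu \le \int_A s_2 \, d\mu$.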

7. If $A$ and $B$ are disjoint measurable sets and $s$ is a simple function, $\int_{A \cup B} s \, d\mu = \int_A s \, d\mu + \int_B s \, d\mu$.

Several of these properties are reminiscent of familiar properties of the Riemann integral.

Proof.
Result 1 $s \ge 0$ if and only if the range of $s$ consists of nonnegative numbers. Therefore, the integral is nonnegative.

Result 2 Assume that $s$ has the standard representation $s = \sum_{i=1}^{k} b_i \chi_{B_i}$ with $A$ given by the disjoint union $A = \bigcup_i B_i$. Then,
\[ \left| \sum_{i=1}^{k} b_i \mu(B_i) \right| \le \sum_{i=1}^{k} |b_i| \mu(B_i) \le \max_i |b_i| \sum_{i=1}^{k} \mu(B_i) = \max_i |b_i| \, \mu(A), \]
by the additivity of a measure.

Result 3 Apply Result 2.

Result 4 First, we note that $\int_A c_1 s_1 \, d\mu = c_1 \int_A s_1 \, d\mu$ is obvious from the definition. So, we only need to prove that $\int_A (s_1 + s_2) \, d\mu = \int_A s_1 \, d\mu + \int_A s_2 \, d\mu$. Assume that $s_1$ has the standard representation $s_1 = \sum_{i=1}^{k} a_i \chi_{A_i}$ with $A$ given by the disjoint union $A = \bigcup_i A_i$, and likewise $s_2$ has the standard representation $s_2 = \sum_{j=1}^{l} b_j \chi_{B_j}$ with $A$ given by the disjoint union $A = \bigcup_j B_j$. The collection $\{A_i \cap B_j,\ 1 \le i \le k,\ 1 \le j \le l\}$ is disjoint and $A = \bigcup_{i,j} (A_i \cap B_j)$. Since $s_1 + s_2 = a_i + b_j$ on $A_i \cap B_j$,
\[ \begin{aligned} \int_A (s_1 + s_2) \, d\mu &= \sum_{1 \le i \le k,\ 1 \le j \le l} (a_i + b_j) \mu(A_i \cap B_j) \\ &= \sum_{1 \le i \le k,\ 1 \le j \le l} a_i \mu(A_i \cap B_j) + \sum_{1 \le i \le k,\ 1 \le j \le l} b_j \mu(A_i \cap B_j) \\ &= \sum_{1 \le i \le k} a_i \mu(A_i) + \sum_{1 \le j \le l} b_j \mu(B_j) = \int_A s_1 \, d\mu + \int_A s_2 \, d\mu. \end{aligned} \]

Result 5 This follows from Result 4 and Result 3 applied to $s_1 - s_2$.

Result 6 This follows from Result 1 and Result 4 applied to $s_2 - s_1$.

Result 7 This follows from Result 4 after using the fact $\chi_{A \cup B} = \chi_A + \chi_B$.

Before continuing, we provide an interpretation of Theorem 9.1.2.

Definition 9.1.2 An operator on a set of functions is positive if it maps the functions into the nonnegative real numbers.
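The formula (9.2) and the representation independence in Theorem 9.1.1 can be checked by hand on Example 9.1.1. The following is a minimal sketch, assuming a simple function is stored as a list of (value, measure of set) pairs; the helper name `integrate_simple` and the particular refinement of $(1/3, 1]$ are illustrative choices, not part of the text.

```python
from fractions import Fraction

def integrate_simple(rep):
    """Integral of a simple function given as [(a_i, mu(A_i)), ...], as in (9.2)."""
    return sum(a * m for a, m in rep)

# s_1 from Example 9.1.1: value 3 on [0, 1/3] and value 5 on (1/3, 1].
s1 = [(Fraction(3), Fraction(1, 3)), (Fraction(5), Fraction(2, 3))]

# The same function with (1/3, 1] split into (1/3, 1/2] and (1/2, 1].
# This split is a hypothetical second representation used only for the check.
s1_refined = [
    (Fraction(3), Fraction(1, 3)),
    (Fraction(5), Fraction(1, 6)),
    (Fraction(5), Fraction(1, 2)),
]

# s_2 from Example 9.1.1: the rationals and a single point have Lebesgue
# measure 0, so only the value 6 on the remaining set of measure 1 contributes.
s2 = [(Fraction(2), Fraction(0)), (Fraction(4), Fraction(0)), (Fraction(6), Fraction(1))]

print(integrate_simple(s1))          # 13/3
print(integrate_simple(s1_refined))  # 13/3
print(integrate_simple(s2))          # 6
```

Exact rational arithmetic is used so that the two representations of $s_1$ can be compared with equality rather than a tolerance.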

Example 9.1.2 $\max$ and $\min$ are positive operators.

Theorem 9.1.2 implies,

Theorem 9.1.3 Integration is a positive linear operator on the set of nonnegative simple functions.

We conclude by stating the probability space versions.

Definition 9.1.3 The expected value, mean, expectation, or first moment of a simple r.v. $S$ is
\[ E(S) = \int S \, dP = \int_{\Omega} S \, dP(x). \]

Theorem 9.1.2 implies,

Theorem 9.1.4 Expectation for simple r.v. satisfies the following:
1. If the simple r.v. $S \ge 0$ for all $x$, then $E(S) \ge 0$.
2. If $S$ is a simple r.v., then $|E(S)| \le E(|S|) \le \sup |S|$.
3. If the simple r.v. $S = 0$ a.s., then $E(S) = 0$.
4. If $S_1, S_2$ are simple r.v. and $c_1, c_2 \in \mathbb{R}$, then $E(c_1 S_1 + c_2 S_2) = c_1 E(S_1) + c_2 E(S_2)$.
5. If the simple r.v. $S_1 = S_2$ a.s., then $E(S_1) = E(S_2)$.
6. If $S_1, S_2$ are simple r.v. such that $S_1 \le S_2$ for all $x$, $E(S_1) \le E(S_2)$.

Example 9.1.3 Consider a deck of cards labeled 1 through $n$ shuffled so that any of the $n!$ arrangements are equally likely. Let $X$ equal the number of cards in a shuffled deck that occupy the same position as their label. We compute $E(X)$. Let $I_m$ be the indicator function of the event in which the card with label $m$ is in position $m$. Then, $X = \sum_{m=1}^{n} I_m$ and
\[ E(I_m) = P(\{x : I_m(x) = 1\}) = (n-1)!/n! = 1/n. \]
Hence,
\[ E(X) = E\left( \sum_{m=1}^{n} I_m \right) = \sum_{m=1}^{n} E(I_m) = 1. \]

9.2 Nonnegative measurable functions and random variables

We base the definition of integration of general measurable functions on the definition for simple functions and the approximation of measurable functions by simple functions. However, there is a complication that arises because of the potential for cancellation that can occur if the function in question takes on positive and negative values. The issue is familiar from the standard Calculus problem of defining the improper Riemann integral $\int_{-\infty}^{\infty} x^{-1} \, dx$. A standard mistake of a Calculus student is to write,
\[ \lim_{M \to \infty,\ m \to 0} \left( \int_{-M}^{-m} x^{-1} \, dx + \int_{m}^{M} x^{-1} \, dx \right), \]
and conclude the integral is 0 because the two finite integrals cancel. But the correct conclusion is that the improper integral is undefined. This is reached by treating the two improper integrals independently.

So, we first treat the case of nonnegative functions. We extend to general functions using the decomposition of measurable functions into positive and negative parts.

Definition 9.2.1 A measurable function $f$ is nonnegative if $f(x) \ge 0$ a.e. in $\mathbb{X}$. A r.v. $X$ is nonnegative if $X(x) \ge 0$ a.s. in $\Omega$.

Now, integration is defined using the fact that a measurable function can be approximated from below by sequences of simple functions via Theorem 8.3.7.

Definition 9.2.2 If $A$ is a measurable set, the integral of a nonnegative extended real valued measurable function $f$ over $A$ is,
\[ \int_A f \, d\mu = \int_A f \, d\mu(x) = \sup \left\{ \int_A s \, d\mu : \text{nonnegative simple functions } s \text{ with } s(x) \le f(x) \text{ for all } x \in A \right\}. \tag{9.5} \]
$f$ is called the integrand. The expected value, mean, expectation, or first moment of a nonnegative extended real valued r.v. $X$ is
\[ E(X) = \int X \, dP = \int X \, dP(x) = \sup \left\{ E(S) : \text{nonnegative simple r.v. } S \text{ with } S(x) \le X(x) \text{ for all } x \in \Omega \right\}. \]

Note that the integral or expectation may be infinite. The following consistency check follows immediately,

Theorem 9.2.1 These definitions applied to simple functions and r.v. give the same results as the original definitions.
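The cancellation issue for $\int_{-\infty}^{\infty} x^{-1} \, dx$ described at the start of this section can be made concrete with a few lines of arithmetic. The sketch below evaluates the two truncated integrals in closed form (using $\int_a^b x^{-1} \, dx = \ln(b/a)$ for $0 < a < b$) and compares a symmetric truncation with one where the right cutoff is twice the left one. The function name and the choice of a factor of 2 are illustrative assumptions.

```python
import math

def truncated_integral(m_left, M_left, m_right, M_right):
    """Closed form of int_{-M_left}^{-m_left} dx/x + int_{m_right}^{M_right} dx/x."""
    left = math.log(m_left / M_left)      # integral over [-M_left, -m_left], negative
    right = math.log(M_right / m_right)   # integral over [m_right, M_right], positive
    return left + right

for k in range(1, 6):
    M, m = 10.0 ** k, 10.0 ** (-k)
    symmetric = truncated_integral(m, M, m, M)      # cutoffs matched on both sides
    skewed = truncated_integral(m, M, m, 2.0 * M)   # right cutoff is 2M instead of M
    print(k, symmetric, skewed)
```

The symmetric truncations stay at (numerically) zero while the skewed ones stay near $\ln 2$, so the value of the "limit" depends on how the two sides are coupled. This is the reason the two improper integrals must be treated independently.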

Example 9.2.1 Consider $f(x) = 2x$ mapping $([0,1], \mathcal{B}_{[0,1]}, \mu_L)$ to $([0,2], \mathcal{B}_{[0,2]})$. We partition $[0,2]$ into
\[ [0, \tfrac{2}{m}],\ (\tfrac{2}{m}, 2 \cdot \tfrac{2}{m}],\ \ldots,\ ((i-1)\tfrac{2}{m}, i \tfrac{2}{m}],\ \ldots,\ ((m-1)\tfrac{2}{m}, 2]. \]
Then,
\[ f^{-1}\big( ((i-1)\tfrac{2}{m}, i\tfrac{2}{m}] \big) = ((i-1)\tfrac{1}{m}, i\tfrac{1}{m}] \]
and, writing $I_i$ for this preimage, $s_m|_{I_i} = (i-1)\tfrac{2}{m}$. Hence,
\[ \int s_m \, d\mu = \sum_{i=1}^{m} (i-1) \frac{2}{m} \cdot \frac{1}{m} = \frac{2}{m^2} \sum_{i=1}^{m} (i-1) = 1 - \frac{1}{m}, \]
which tends to 1 as $m \to \infty$. Nearly the same computation shows that the integral of $f(x) = cx$, for constant $c > 0$, on $([a,b], \mathcal{B}_{[a,b]}, \mu_L)$ gives $\tfrac{c}{2}(b^2 - a^2)$.

Definition 9.2.2 imposes minimal conditions on the simple functions used in the computation. However, computing the sup over a large class of simple functions is impractical, e.g. in Example 9.2.1, we should compute the supremum of integrals over simple functions with general partitions. The next result says the integral can be computed as a limit of integrals of a sequence of simple functions that approximate the integrand from below, as provided by Theorem 8.3.7 for example. This provides a way to approximate the integral as well!

Theorem 9.2.2 Let $A$ be a measurable set and $f$ be a nonnegative extended real valued measurable function. Assume that $\{s_i\}$ is a sequence of simple functions such that $0 \le s_1 \le s_2 \le \cdots \le f$ and $s_i \to f$ pointwise. Then,
\[ \int_A s_i \, d\mu \to \int_A f \, d\mu. \]

Nominally, we can use the approximating sequence $\{s_i\}$ constructed in Theorem 8.3.7 to compute a numerical approximation of the integral of a given nonnegative measurable function. In practice (unlike the case of the Riemann integral, which is based on a partition of the input domain), this involves computing the measure of potentially complicated sets, see Figure 9.1.

Example 9.2.2 Consider $f(x) = \sqrt{x}$ mapping $([0,1], \mathcal{B}_{[0,1]}, \mu_L)$ to $([0,1], \mathcal{B}_{[0,1]})$. We partition $[0,1]$ into
\[ [0, \tfrac{1}{m}],\ (\tfrac{1}{m}, 2 \cdot \tfrac{1}{m}],\ \ldots,\ ((i-1)\tfrac{1}{m}, i\tfrac{1}{m}],\ \ldots,\ ((m-1)\tfrac{1}{m}, 1]. \]
Then,
\[ f^{-1}\big( ((i-1)\tfrac{1}{m}, i\tfrac{1}{m}] \big) = ((i-1)^2 \tfrac{1}{m^2}, i^2 \tfrac{1}{m^2}] \]
and, writing $I_i$ for this preimage, $s_m|_{I_i} = (i-1)\tfrac{1}{m}$.

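Hence,
\[ \int s_m \, d\mu = \sum_{i=1}^{m} (i-1)\frac{1}{m} \left( \frac{i^2}{m^2} - \frac{(i-1)^2}{m^2} \right) = \frac{1}{m^3} \sum_{i=1}^{m} (i-1)(2i-1) = \frac{1}{6} \frac{(m-1)(4m+1)}{m^2}. \]
Passing to the limit as $m \to \infty$ gives $\int \sqrt{x} \, d\mu = 2/3$.

Figure 9.1. Illustration of the approximations used to compute the Riemann integral (left) and Lebesgue integral (right). The Riemann integral begins with a partition of the input domain for the function. The Lebesgue integral begins with a partition of the output range of the function. On the right, the subintervals of the input domain associated with the same subinterval on the output are shaded with the same color.

Proof (of Theorem 9.2.2). First note that $\left\{ \int_A s_i \, d\mu \right\}$ is a monotone increasing sequence of real numbers, hence there is an $L \in [0, \infty]$ such that $\int_A s_i \, d\mu \to L$. We show that $L = \int_A f \, d\mu$. Since $s_i \le f$ for all $i$, $L \le \int_A f \, d\mu$. So, we show that $L \ge \int_A f \, d\mu$. This follows from the definition if we show that $\int_A s \, d\mu \le L$ for all nonnegative simple functions $s$ with $s \le f$.

Let $s$ be a nonnegative simple function with $s \le f$ and assume
\[ s = \sum_{j=1}^{k} a_j \chi_{A_j}, \]
where $\{A_j\}$ is a sequence of nonintersecting measurable sets with $\mathbb{X} = \bigcup_j A_j$. If we prove that for each $j$,
\[ \lim_{i \to \infty} \int_{A \cap A_j} s_i \, d\mu \ge \int_{A \cap A_j} s \, d\mu = a_j \mu(A \cap A_j), \tag{9.6} \]
then summing gives
\[ L = \lim_{i \to \infty} \int_A s_i \, d\mu = \lim_{i \to \infty} \sum_{j=1}^{k} \int_{A \cap A_j} s_i \, d\mu \ge \sum_{j=1}^{k} \int_{A \cap A_j} s \, d\mu = \int_A s \, d\mu. \]

We fix $1 \le j \le k$. If $a_j = 0$, then (9.6) follows, so we assume $a_j > 0$. Choose an $\varepsilon > 0$ with $\varepsilon < a_j$. On $A \cap A_j$, $s_i \to f$ pointwise and $f \ge s = a_j$, hence for any $x \in A \cap A_j$, $s_i(x) \ge a_j - \varepsilon$ for all sufficiently large $i$. We define a sequence of measurable sets $\{B_i\}$ via,
\[ B_i = \{ x \in A \cap A_j : s_i(x) \ge a_j - \varepsilon \}. \]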

$\{B_i\}$ is a monotone sequence, i.e. $B_i \subset B_{i+1}$, and $\bigcup_i B_i = A \cap A_j$. Moreover, $\mu(B_i) \to \mu(A \cap A_j)$. Then,
\[ \int_{A \cap A_j} s_i \, d\mu = \int_{B_i} s_i \, d\mu + \int_{(A \cap A_j) \setminus B_i} s_i \, d\mu \ge \int_{B_i} s_i \, d\mu \ge \int_{B_i} (a_j - \varepsilon) \, d\mu = (a_j - \varepsilon) \mu(B_i). \]
Taking the limit (noting that the quantities on both sides are monotone in $i$),
\[ \lim_{i \to \infty} \int_{A \cap A_j} s_i \, d\mu \ge (a_j - \varepsilon) \mu(A \cap A_j), \]
and since $\varepsilon$ is arbitrary, (9.6) is established.

The integral of nonnegative measurable functions inherits the properties of integration for simple functions:

Theorem 9.2.3 Integration for nonnegative extended real valued measurable functions satisfies the following:
1. If $A$ is a measurable set and $f$ is a nonnegative extended real valued measurable function, then $\int_A f \, d\mu \ge 0$.
2. If $f$ is a nonnegative extended real valued measurable function and $A$ is a measurable set, $\int_A f \, d\mu \le \sup_A f \, \mu(A)$.
3. If $A$ is a measurable set and the nonnegative extended real valued measurable function $f$ satisfies $f = 0$ a.e. on $A$, then $\int_A f \, d\mu = 0$.
4. If $f_1, f_2$ are nonnegative extended real valued measurable functions, $c_1, c_2 \in \mathbb{R}_+$, and $A$ is a measurable set, $\int_A (c_1 f_1 + c_2 f_2) \, d\mu = c_1 \int_A f_1 \, d\mu + c_2 \int_A f_2 \, d\mu$.
5. If $A$ is a measurable set and the nonnegative extended real valued measurable functions $f_1, f_2$ satisfy $f_1 = f_2$ a.e. on $A$, then $\int_A f_1 \, d\mu = \int_A f_2 \, d\mu$.

6. If $A$ is a measurable set and the nonnegative extended real valued measurable functions $f_1, f_2$ satisfy $f_1 \le f_2$ for all $x \in A$, $\int_A f_1 \, d\mu \le \int_A f_2 \, d\mu$.
7. If $A$ and $B$ are disjoint measurable sets and $f$ is a nonnegative extended real valued function, $\int_{A \cup B} f \, d\mu = \int_A f \, d\mu + \int_B f \, d\mu$.
8. If $A \subset B$ are measurable sets and $f$ is a nonnegative extended real valued function, $\int_A f \, d\mu \le \int_B f \, d\mu$.

Proof. Either the original Definition 9.2.2 or Theorem 9.2.2 may be useful for proving properties of the integral.

Result 1 The integrals of the simple functions in (9.5) are all nonnegative by Theorem 9.1.2.

Result 2 This follows from
\[ \begin{aligned} \int_A f \, d\mu &= \sup \left\{ \int_A s \, d\mu : \text{nonnegative simple functions } s \text{ with } s(x) \le f(x) \text{ for all } x \in A \right\} \\ &\le \sup \left\{ \int_A s \, d\mu : \text{nonnegative simple functions } s \text{ with } s(x) \le \sup_A f \text{ for all } x \in A \right\} = \sup_A f \, \mu(A), \end{aligned} \]
since $\sup_A f$ is a constant function on $A$.

Result 3 This follows from the observation that we can restrict the simple functions in (9.5) to simple functions that are zero a.e.

Result 4 First note that $c_1 f_1 + c_2 f_2$ is a nonnegative extended real valued measurable function, so its integral is defined. Next, consider $c_1 f_1$ with $c_1 > 0$ (the case $c_1 = 0$ is trivial):
\[ \begin{aligned} \int_A c_1 f_1 \, d\mu &= \sup \left\{ \int_A s \, d\mu : \text{nonnegative simple functions } s \text{ with } s(x) \le c_1 f_1(x) \text{ for all } x \in A \right\} \\ &= c_1 \sup \left\{ \int_A \frac{s}{c_1} \, d\mu : \text{nonnegative simple functions } \frac{s}{c_1} \text{ with } \frac{s(x)}{c_1} \le f_1(x) \text{ for all } x \in A \right\} \\ &= c_1 \int_A f_1 \, d\mu. \end{aligned} \]
Hence, it suffices to consider $c_1 = c_2 = 1$. There are monotone sequences of nonnegative simple functions $\{s_{1,i}\}$ and $\{s_{2,i}\}$ with $0 \le s_{l,1} \le s_{l,2} \le \cdots \le f_l$ and $s_{l,i} \to f_l$ pointwise, for $l = 1, 2$. Moreover, $\{s_{1,i} + s_{2,i}\}$ is a monotone sequence of nonnegative simple functions with $0 \le s_{1,1} + s_{2,1} \le s_{1,2} + s_{2,2} \le \cdots \le f_1 + f_2$ and $s_{1,i} + s_{2,i} \to f_1 + f_2$. Finally, Theorem 9.2.2 implies,
\[ \int_A f_1 \, d\mu + \int_A f_2 \, d\mu = \lim_{i \to \infty} \int_A s_{1,i} \, d\mu + \lim_{i \to \infty} \int_A s_{2,i} \, d\mu = \lim_{i \to \infty} \int_A (s_{1,i} + s_{2,i}) \, d\mu = \int_A (f_1 + f_2) \, d\mu. \]

Result 5 This follows by applying Result 3 and Result 4 to $f_1 - f_2$.

Result 6 This follows by applying Result 1 and Result 4 to $f_2 - f_1$.

Result 7 Choose a monotone sequence of nonnegative simple functions $\{s_i\}$ with $s_i \to f$ pointwise. By Theorem 9.2.2 and Theorem 9.1.2,
\[ \int_{A \cup B} f \, d\mu = \lim_{i \to \infty} \int_{A \cup B} s_i \, d\mu = \lim_{i \to \infty} \left( \int_A s_i \, d\mu + \int_B s_i \, d\mu \right) = \lim_{i \to \infty} \int_A s_i \, d\mu + \lim_{i \to \infty} \int_B s_i \, d\mu = \int_A f \, d\mu + \int_B f \, d\mu. \]

Result 8 By Results 7 and 1,
\[ \int_B f \, d\mu = \int_{B \setminus A} f \, d\mu + \int_A f \, d\mu \ge \int_A f \, d\mu. \]

Note on the proof: It is a good idea to study this proof carefully.

Note that this implies

Theorem 9.2.4 Integration is a positive linear operator on the set of nonnegative integrable functions.

The corresponding properties of expectation are:

Theorem 9.2.5
1. If $X$ is a nonnegative r.v., then $E(X) \ge 0$.
2. If $X$ is a nonnegative r.v., then $E(X) \le \sup_{\Omega} X$.
3. If $X$ is a nonnegative r.v. such that $X = 0$ a.s., $E(X) = 0$.
4. If $X_1, X_2$ are nonnegative r.v. and $c_1, c_2 \in \mathbb{R}_+$, $E(c_1 X_1 + c_2 X_2) = c_1 E(X_1) + c_2 E(X_2)$.
5. If the nonnegative r.v. $X_1, X_2$ satisfy $X_1 = X_2$ a.s., then $E(X_1) = E(X_2)$.
6. If the nonnegative r.v. $X_1, X_2$ satisfy $X_1 \le X_2$ a.s., $E(X_1) \le E(X_2)$.

Example 9.2.3 Set $\Omega = \mathbb{N}$ and choose the $\sigma$-algebra $\mathcal{P}(\mathbb{N})$ of all subsets. For the measure, let $\{p_i\}_{i=1}^{\infty}$ be a sequence of nonnegative numbers with $\sum_{i=1}^{\infty} p_i = 1$ and set $P(\{i\}) = p_i$. Then, any function $X : \mathbb{N} \to \mathbb{R}_+$ is a nonnegative r.v. and $E(X) = \sum_{i=1}^{\infty} X(i) p_i$.

Example 9.2.4 We compute the expectation of the r.v. $X = x^2$ on $(\mathbb{R}, \mathcal{B}, P)$ with $P$ equal to the normal distribution with variance $\sigma^2 = 1$ and mean $\mu = 0$ (Example 7.2.5). The expectation is well defined. Moreover, it is finite because $|x|^q \exp(-c x^2) \to 0$ as $|x| \to \infty$ for any $q > 0$ and $c > 0$. Working from the definition is a very tedious computation. We compute an approximation using the Monte Carlo method with pseudo random numbers drawn from the normal distribution, in numbers ranging from $2^6$ to $2^{23}$, and plot the results in Figure 9.2. The values appear to converge to 1.

Figure 9.2. Values of a Monte Carlo approximation to the expectation in Example 9.2.4. (Axes: approximate expectation versus log(number of samples).)

Example 9.2.5 We compute the expected value of the r.v. equal to the length of chords on the unit circle using the first probability model described in Example 7.1.3. In that model, we map the chords of the circle to a parallelepiped $R_1$ with area $4\pi^2$ and we normalize Lebesgue measure to get the uniform probability measure. The initial and end points of a chord from $x_1$ to $x_2$ are $(\cos(x_1), \sin(x_1))$ and $(\cos(x_2), \sin(x_2))$ respectively, see Figure 9.3. We define the r.v.
\[ Y(x_1, x_2) = \sqrt{ (\cos(x_2) - \cos(x_1))^2 + (\sin(x_2) - \sin(x_1))^2 } = 2 \left| \sin\left( \frac{x_2 - x_1}{2} \right) \right|. \]
The range of $Y$ is $[0,2]$ and the domain is the parallelepiped $R_1$. We set $h = 2/m$ and partition
\[ [0,2] = [0, h] \cup (h, 2h] \cup \cdots \cup ((i-1)h, ih] \cup \cdots \cup (2-h, 2]. \]
$Y^{-1}$ is monotone increasing, see Figure 9.3. Thus,
\[ Y^{-1}\big( ((i-1)h, ih] \big) = \left\{ (x_1, x_2) : 2 \sin^{-1}\left( \frac{(i-1)h}{2} \right) < x_2 - x_1 \le 2 \sin^{-1}\left( \frac{ih}{2} \right) \right\}. \]
We plot a corresponding partition of $R_1$ in Figure 9.4. We plot randomly selected chords corresponding to intervals with $ih \approx 1.2$ and $ih \approx 1.9$ in Figure 9.5. It follows that
\[ P\left( Y^{-1}\big( ((i-1)h, ih] \big) \right) = \frac{2}{\pi} \left( \sin^{-1}\left( \frac{ih}{2} \right) - \sin^{-1}\left( \frac{(i-1)h}{2} \right) \right). \]
We let $S$ denote the simple function approximation of $Y$ corresponding to the partition of $[0,2]$. Therefore,
\[ E(S) = \sum_{i=1}^{m} (i-1)h \, \frac{2}{\pi} \left( \sin^{-1}\left( \frac{ih}{2} \right) - \sin^{-1}\left( \frac{(i-1)h}{2} \right) \right). \]
Setting $x_{i-1} = (i-1)h$, and multiplying the right-hand side by $h/h$, we obtain
\[ E(S) = \frac{2}{\pi} \sum_{i=1}^{m} x_{i-1} \, \frac{ \sin^{-1}\left( \frac{x_{i-1} + h}{2} \right) - \sin^{-1}\left( \frac{x_{i-1}}{2} \right) }{h} \, h. \]
We see that
\[ \frac{ \sin^{-1}\left( \frac{x_{i-1} + h}{2} \right) - \sin^{-1}\left( \frac{x_{i-1}}{2} \right) }{h} \to \frac{d}{dx} \sin^{-1}(x/2) \quad \text{at } x = (i-1)h \text{ as } h \to 0. \]
So, in the limit of $m \to \infty$ and $h \to 0$, this sum converges to the Lebesgue integral,
\[ E(Y) = \frac{1}{\pi} \int_{[0,2]} \frac{x}{\sqrt{1 - (x/2)^2}} \, d\mu_L(x). \]
We evaluate this integral as a Riemann integral to find that $E(Y) = 4/\pi$.

Figure 9.3. Left: Defining the r.v. $Y$ for Example 9.2.5. Right: Plot of $Y^{-1}$.

9.3 General measurable functions and random variables

We extend integration to general measurable functions using the decomposition of a measurable function $f$ into its positive $f^+$ and negative $f^-$ parts described in Definition 8.2.5 and Theorem 8.2.12 and applying the definition for nonnegative functions to each part.
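As a numerical aside on Example 9.2.5 of the previous section, the sums $E(S)$ over the range partition with $h = 2/m$ can be evaluated directly and compared with the stated value $E(Y) = 4/\pi$. This is a minimal sketch: the function name `expected_simple` is an illustrative choice, and $ih/2$ is written as `i / m` so that the argument of `asin` never exceeds 1 through rounding.

```python
import math

def expected_simple(m):
    """E(S) for the range partition of [0, 2] with h = 2/m, as in Example 9.2.5."""
    h = 2.0 / m
    total = 0.0
    for i in range(1, m + 1):
        # P(Y in ((i-1)h, ih]) = (2/pi) * (asin(ih/2) - asin((i-1)h/2)), with ih/2 = i/m
        p_i = (2.0 / math.pi) * (math.asin(i / m) - math.asin((i - 1) / m))
        total += (i - 1) * h * p_i   # S takes the value (i-1)h on this set
    return total

for m in (10, 100, 1000, 10000):
    print(m, expected_simple(m), 4.0 / math.pi - expected_simple(m))
```

Because $S \le Y \le S + h$, the gap $E(Y) - E(S)$ should lie between 0 and $h = 2/m$, which gives a simple sanity check on the printed values.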

Figure 9.4. A partition of $R_1$ corresponding to a partition of the range of $Y$.

Figure 9.5. Randomly selected chords corresponding to intervals with $ih \approx 0.6$ (left), $ih \approx 1.2$ (center) and $ih \approx 1.9$ (right).

Definition 9.3.1 Let $f$ be an extended real valued measurable function and $A$ a measurable set. If $\int_A f^+ \, d\mu$ and $\int_A f^- \, d\mu$ are not both infinite, then
\[ \int_A f \, d\mu = \int_A f^+ \, d\mu - \int_A f^- \, d\mu. \]
We say that the integral of $f$ is defined (on $A$). If $\int_A f^+ \, d\mu$ and $\int_A f^- \, d\mu$ are both infinite, we say that the integral is not defined (on $A$). If $\int_A f^+ \, d\mu$ and $\int_A f^- \, d\mu$ are both finite, then we say that $f$ is (Lebesgue) integrable (on $A$). If $A = \mathbb{X}$, we say that $f$ is (Lebesgue) integrable and write $f \in L^1(\mathbb{X}, \mathcal{M}, \mu) = L^1$.
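The case distinction in Definition 9.3.1 can be sketched on a toy measure space. The code below assumes a space made of finitely many atoms, where a function is a list of extended real values and the measure is a list of atom weights (possibly infinite). The helper names `times` and `integral` are hypothetical, and products use the convention $0 \cdot \infty = 0$.

```python
import math

def times(a, w):
    """Product of a value and a weight with the convention 0 * inf = 0."""
    return 0.0 if a == 0 or w == 0 else a * w

def integral(values, weights):
    """Integral of f over finitely many atoms, following Definition 9.3.1.

    Returns None when the integrals of f^+ and f^- are both infinite,
    i.e. when the integral is not defined.
    """
    pos = sum(times(max(v, 0.0), w) for v, w in zip(values, weights))   # integral of f^+
    neg = sum(times(max(-v, 0.0), w) for v, w in zip(values, weights))  # integral of f^-
    if math.isinf(pos) and math.isinf(neg):
        return None
    return pos - neg

print(integral([1.0, -2.0, 3.0], [0.5, 0.25, 0.25]))   # integrable: 0.75
print(integral([math.inf, -1.0], [0.0, 1.0]))          # infinite value on a null atom: -1.0
print(integral([1.0, -1.0], [math.inf, math.inf]))     # both parts infinite: None
```

Splitting into `pos` and `neg` before subtracting is the point of the definition: the two parts are integrated independently, and the difference is formed only when it is not of the form $\infty - \infty$.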

If $X$ is an extended real valued random variable and $E(X^+)$ and $E(X^-)$ are not both infinite, then the expected value, mean, expectation, or first moment of $X$ is
\[ E(X) = E(X^+) - E(X^-). \]

The point of this definition is that we avoid $\infty - \infty$. Also note that a function whose integral is defined is necessarily measurable. We note that if either the measure or the $\sigma$-algebra is changed, then integrability is likely to change as well.

Example 9.3.1 Consider $(\mathbb{R}, \mathcal{B}, \mu_L)$. Then, $\int x \, d\mu$, $\int \sin(x) \, d\mu$, and $\int \frac{1}{x} \, d\mu$ are all undefined.

Example 9.3.2 Consider $\int x \, dP$ on $(\mathbb{R}, \mathcal{B}, P)$ where $P$ is the normal distribution. Both $\int x^+ \, dP$ and $\int x^- \, dP$ are finite because $|x|^q \exp(-c x^2) \to 0$ as $|x| \to \infty$ for any $q > 0$ and $c > 0$. By symmetry, $\int x \, dP = 0$. We compute an approximation using the Monte Carlo method with pseudo random numbers drawn from the normal distribution, in numbers ranging from $2^6$ to $2^{53}$, and plot the results in Figure 9.6. The values appear to converge to 0.

Figure 9.6. Values of a Monte Carlo approximation to the integral in Example 9.3.2. (Axes: log(approximate integral) versus log(number of samples).)

We next state the basic properties of integration for general measurable functions. The results are familiar, but the assumptions about the integrability of the functions involved are critical for their validity. In these statements, we use the convention that $0 \cdot \infty = \infty \cdot 0 = 0$. This is not a statement about limits, e.g.
\[ \lim_{n \to \infty} \frac{1}{n} \cdot n \ \ne\ \lim_{n \to \infty} \frac{1}{n^2} \cdot n \ \ne\ \lim_{n \to \infty} \frac{1}{n} \cdot n^2. \]
But this convention is fine in the context of the results below.
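The Monte Carlo approximations mentioned in Examples 9.2.4 and 9.3.2 can be sketched with the standard library alone. This is a minimal illustration under stated assumptions: it uses Python's `random.gauss` as the source of standard normal pseudo random numbers, a fixed seed for reproducibility, and sample sizes up to $2^{18}$ only, far fewer than in the text. The helper name `mc_expectation` is illustrative.

```python
import random

def mc_expectation(g, n, rng):
    """Average of g over n pseudo random draws from the standard normal distribution."""
    return sum(g(rng.gauss(0.0, 1.0)) for _ in range(n)) / n

rng = random.Random(0)  # fixed seed, an arbitrary choice
for k in (6, 10, 14, 18):
    n = 2 ** k
    second_moment = mc_expectation(lambda x: x * x, n, rng)  # approximates E(x^2) = 1
    first_moment = mc_expectation(lambda x: x, n, rng)       # approximates E(x) = 0
    print(k, second_moment, first_moment)
```

The sample averages fluctuate around 1 and 0 respectively, with fluctuations that shrink roughly like $1/\sqrt{n}$. That slow rate is consistent with the large sample counts used in the examples.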

Theorem 9.3.1: Properties of the Lebesgue Integral Integration for measurable functions satisfies the following properties:
1. If the integral of the extended real valued measurable function $f$ is defined on $A$, then the integral of $f$ is defined on any measurable set in $A$.
2. If the integral of $f$ is defined on the measurable set $A$, $\left| \int_A f \, d\mu \right| \le \int_A |f| \, d\mu \le \sup_A |f| \, \mu(A)$.
3. If the integral of the extended real valued measurable function $f$ is not defined, then $\int |f| \, d\mu = \infty$.
4. If $A$ is a measurable set and the extended real valued measurable function $f$ satisfies $f = c$ a.e. on $A$ for some constant $c$, then $\int_A f \, d\mu = c \mu(A)$.
5. If the integrals of the extended real valued measurable functions $f_1, f_2$ are defined, $c_1, c_2 \in \mathbb{R}$, the integral of $c_1 f_1 + c_2 f_2$ is defined, $A$ is a measurable set, and $c_1 \int_A f_1 \, d\mu + c_2 \int_A f_2 \, d\mu$ is not of the form $\infty - \infty$ or $-\infty + \infty$, then $\int_A (c_1 f_1 + c_2 f_2) \, d\mu = c_1 \int_A f_1 \, d\mu + c_2 \int_A f_2 \, d\mu$.
6. If $A$ is a measurable set and the extended real valued measurable functions $f_1, f_2$ satisfy $f_1 = f_2$ a.e. on $A$, then either the integrals of both $f_1$ and $f_2$ are defined on $A$ and $\int_A f_1 \, d\mu = \int_A f_2 \, d\mu$, or neither of the integrals of $f_1$ or $f_2$ is defined on $A$.
7. If $A$ is a measurable set, the extended real valued measurable functions $f_1, f_2$ satisfy $f_1 \le f_2$ for all $x \in A$, and either the integral of $f_1$ is defined and $\int_A f_1 \, d\mu > -\infty$ or the integral of $f_2$ is defined and $\int_A f_2 \, d\mu < \infty$, then the integral of the other of the functions $f_1, f_2$ is defined, and $\int_A f_1 \, d\mu \le \int_A f_2 \, d\mu$.
8. If $A$ is a measurable set and the integrals of the extended real valued measurable functions $f_1, f_2$ are defined on $A$ and satisfy $\int_A f_1 \, d\mu = \int_A f_2 \, d\mu$ and $f_1 \le f_2$ a.e. on $A$, then $f_1 = f_2$ a.e. on $A$.
9. If $f$ is an extended real valued measurable function, then $\int |f| \, d\mu = 0 \iff f = 0$ a.e.

10. If $A$ is a measurable set and $f_1, f_2$ are extended real valued measurable functions on $A$, then $\int_A |f_1 + f_2| \, d\mu \le \int_A |f_1| \, d\mu + \int_A |f_2| \, d\mu$.
11. If $A$ and $B$ are disjoint measurable sets and the integral of the extended real valued measurable function $f$ is defined on $A \cup B$, $\int_{A \cup B} f \, d\mu = \int_A f \, d\mu + \int_B f \, d\mu$.
12. If $f$ is an extended real valued integrable function, then $f$ is finite a.e.

Proof. The proofs of these results follow from Theorem 9.2.3, sometimes after checking various cases of $\infty$ for the integrals involved.

Result 1 This follows from the observation that for any measurable $B \subset A$, $\int_B f^{\pm} \, d\mu \le \int_A f^{\pm} \, d\mu$.

Result 2 This follows from the observations,
\[ \left| \int_A f^+ \, d\mu - \int_A f^- \, d\mu \right| \le \int_A f^+ \, d\mu + \int_A f^- \, d\mu, \]
and $\sup_A (f^+ + f^-) = \sup_A |f|$.

Result 3 This follows from the definition.

Result 4 $f$ is a simple function a.e.

Result 5 This is typical of the extension of a result about nonnegative functions in Theorem 9.2.3 to the general case. First, it is easy to see that we can assume $c_1 = c_2 = 1$ without loss of generality (see the proof of Theorem 9.2.3). We use the equality,
\[ (f_1 + f_2)^+ + f_1^- + f_2^- = (f_1 + f_2)^- + f_1^+ + f_2^+. \]
Since each side is a sum of nonnegative measurable functions, we have
\[ \int_A (f_1 + f_2)^+ \, d\mu + \int_A f_1^- \, d\mu + \int_A f_2^- \, d\mu = \int_A (f_1 + f_2)^- \, d\mu + \int_A f_1^+ \, d\mu + \int_A f_2^+ \, d\mu. \]
If all the terms are finite, we can rearrange them to get
\[ \int_A (f_1 + f_2)^+ \, d\mu - \int_A (f_1 + f_2)^- \, d\mu = \int_A f_1^+ \, d\mu - \int_A f_1^- \, d\mu + \int_A f_2^+ \, d\mu - \int_A f_2^- \, d\mu. \]
This shows the result. Then, we have to verify what happens when various terms are infinite.

Results 6 and 7 Exercise.

Result 8 We have $f_2 - f_1 \ge 0$ a.e. on $A$, so the integral of $f_2 - f_1$ is defined. We apply Result 5 to $f_2 = f_1 + (f_2 - f_1)$ to get
\[ \int_A f_2 \, d\mu = \int_A f_1 \, d\mu + \int_A (f_2 - f_1) \, d\mu. \]
Since $f_1$ and $f_2$ are integrable, we subtract the first term on the right from both sides to conclude that $\int_A (f_2 - f_1) \, d\mu = 0$.

For $\varepsilon > 0$, define the simple function
$$s(x) = \begin{cases} 0, & f_2(x) - f_1(x) < \varepsilon, \\ \varepsilon, & f_2(x) - f_1(x) \ge \varepsilon. \end{cases}$$
It is an exercise to show that $\int_A s\,d\mu = 0$. This implies that $\mu(\{x \in A : f_2(x) - f_1(x) \ge \varepsilon\}) = 0$. We let $\varepsilon \to 0$ through a sequence and use continuity of measure to conclude $\mu(\{x \in A : f_2(x) - f_1(x) > 0\}) = 0$.

Results 9, 10 and 11. Exercise.

Result 12. If $f$ is integrable and $\mu(\{x : |f(x)| = \infty\}) > 0$, then $\int |f|\,d\mu = \infty$ by Result 11, which is a contradiction.

The random variable version is

Theorem 9.3.2: Properties of Expectation

1. If $X$ is an extended real valued r.v. whose expectation is defined, then $|E(X)| \le E(|X|) \le \sup |X|$.
2. If $X$ is an extended real valued r.v. whose expectation is not defined, then $E(|X|) = \infty$.
3. If $X$ is an extended real valued r.v. such that $X = c$ a.s., then $E(X) = c$.
4. If $X_1, X_2$ are r.v. whose expectations are defined, $c_1, c_2 \in \mathbb{R}$, the expectation of $c_1 X_1 + c_2 X_2$ is defined, and $c_1 E(X_1) + c_2 E(X_2)$ is not of the form $\infty - \infty$ or $-\infty + \infty$, then $E(c_1 X_1 + c_2 X_2) = c_1 E(X_1) + c_2 E(X_2)$.
5. If the extended real valued r.v. $X_1, X_2$ satisfy $X_1 = X_2$ a.s., then either the expectations of $X_1$ and $X_2$ are both defined and $E(X_1) = E(X_2)$, or neither expectation is defined.
6. If the extended real valued r.v. $X_1, X_2$ satisfy $X_1 \le X_2$ a.s., and either the expectation of $X_1$ is defined and $E(X_1) \ne -\infty$ or the expectation of $X_2$ is defined and $E(X_2) \ne \infty$, then the other expectation is defined and $E(X_1) \le E(X_2)$.
7. If the expectations of the extended real valued r.v. $X_1, X_2$ are defined with $E(X_1) = E(X_2)$, and $X_1 \le X_2$ a.s., then $X_1 = X_2$ a.s.
8. If $X$ is an extended real valued r.v., then $E(|X|) = 0 \iff X = 0$ a.s.
9. If $X_1, X_2$ are extended real valued r.v., then $E(|X_1 + X_2|) \le E(|X_1|) + E(|X_2|)$.
10. If $X$ is an extended real valued r.v. that is integrable, i.e. $E(|X|) < \infty$, then $X$ is finite a.s.

Remark 9.3.1 It is apparently difficult to compute Lebesgue integrals using the definition. In Section 9.6, we show that Lebesgue integrals can be computed as Riemann integrals in many situations. The rough rule of thumb is that if the Riemann integral is defined, it is equal to the Lebesgue integral. To make it easier to give examples, we use this fact from now on.

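9.4 Properties of integration and expectation

We next develop more properties of integration with respect to various limits, beginning with the case of integration of nonnegative measurable functions and random variables.

Result 7 of Theorem 9.2.3 indicates that the integral $\int_A f\,d\mu$ of a nonnegative measurable function is finitely additive considered as a set function of the domain $A$. The following theorem states that this set function is in fact countably additive.

Theorem 9.4.1 Let $f$ be a nonnegative extended real valued measurable function and $\{A_i\}$ a disjoint sequence of measurable sets with $A = \bigcup_{i} A_i$. Then,
$$\int_A f\,d\mu = \sum_{i=1}^{\infty} \int_{A_i} f\,d\mu. \quad (9.7)$$

Proof. Assume that $s = \sum_{j=1}^{k} b_j \chi_{B_j}$ is a nonnegative simple function, where $\{B_j\}$ is a disjoint collection of measurable sets with $\mathbb{X} = \bigcup_j B_j$. Then,
$$\int_A s\,d\mu = \sum_{j=1}^{k} b_j\,\mu(A \cap B_j) = \sum_{j=1}^{k} b_j \sum_{i=1}^{\infty} \mu(A_i \cap B_j) = \sum_{i=1}^{\infty} \sum_{j=1}^{k} b_j\,\mu(A_i \cap B_j) = \sum_{i=1}^{\infty} \int_{A_i} s\,d\mu.$$
For $\varepsilon > 0$, choose a nonnegative simple function $s$ with $s \le f$ and $\int_A f\,d\mu \le \int_A s\,d\mu + \varepsilon$. Now,
$$\int_A f\,d\mu \le \int_A s\,d\mu + \varepsilon = \sum_{i=1}^{\infty} \int_{A_i} s\,d\mu + \varepsilon \le \sum_{i=1}^{\infty} \int_{A_i} f\,d\mu + \varepsilon.$$
Since $\varepsilon$ is arbitrary, $\int_A f\,d\mu \le \sum_{i=1}^{\infty} \int_{A_i} f\,d\mu$.

To prove the reverse inequality, we use induction. For $\varepsilon > 0$, let $s_1, s_2$ be nonnegative simple functions with $s_k \le f$ and $\int_{A_k} s_k\,d\mu \ge \int_{A_k} f\,d\mu - \varepsilon$ for $k = 1, 2$. Let $s = \max\{s_1, s_2\}$. Then $s$ is a nonnegative simple function with $s_k \le s \le f$ for $k = 1, 2$. Thus,
$$\int_{A_1 \cup A_2} f\,d\mu \ge \int_{A_1 \cup A_2} s\,d\mu = \int_{A_1} s\,d\mu + \int_{A_2} s\,d\mu \ge \int_{A_1} f\,d\mu + \int_{A_2} f\,d\mu - 2\varepsilon.$$
Since $\varepsilon$ is arbitrary,
$$\int_{A_1 \cup A_2} f\,d\mu \ge \int_{A_1} f\,d\mu + \int_{A_2} f\,d\mu,$$
and by induction,
$$\int_{\bigcup_{i=1}^{k} A_i} f\,d\mu \ge \sum_{i=1}^{k} \int_{A_i} f\,d\mu.$$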

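Since $\bigcup_{i=1}^{k} A_i \subset A$,
$$\int_A f\,d\mu \ge \int_{\bigcup_{i=1}^{k} A_i} f\,d\mu \ge \sum_{i=1}^{k} \int_{A_i} f\,d\mu.$$
Since $k$ is arbitrary, we have proved the result.

Note on the proof: This result uses the nonnegativity of the function $f$ at several points. It contains some useful ideas on how to treat a condition defined by a sup over a class of functions.

Example 9.4.1 We integrate the sawtooth function shown in Figure 9.7 on $((0,1], \mathcal{B}_{(0,1]}, \mu_{\mathcal{L}})$. This is a piecewise linear function $f$ interpolating the points $(1, 0)$, $(3/4, 1)$, $(1/2, 0)$, $(3/8, -1)$, $(1/4, 0)$, $(3/16, 1)$, $(1/8, 0)$, $\dots$, with peaks that alternate in sign. Note that $\lim_{x \to 0} f(x)$ is undefined. Theorem 9.4.1, applied to $f^+$ and $f^-$, implies that
$$\int_{(0,1]} f\,d\mu_{\mathcal{L}} = \int_{(1/2,1]} f\,d\mu_{\mathcal{L}} + \int_{(1/4,1/2]} f\,d\mu_{\mathcal{L}} + \int_{(1/8,1/4]} f\,d\mu_{\mathcal{L}} + \cdots.$$
This is a case in which the Riemann and Lebesgue integrals on the subintervals agree. Each piece is a triangle of height $\pm 1$, so
$$\int_{(0,1]} f\,d\mu_{\mathcal{L}} = \tfrac{1}{2}\cdot 1\cdot\left(1 - \tfrac{1}{2}\right) - \tfrac{1}{2}\cdot 1\cdot\left(\tfrac{1}{2} - \tfrac{1}{4}\right) + \tfrac{1}{2}\cdot 1\cdot\left(\tfrac{1}{4} - \tfrac{1}{8}\right) - \tfrac{1}{2}\cdot 1\cdot\left(\tfrac{1}{8} - \tfrac{1}{16}\right) + \cdots.$$
Summing the series yields $\int_{(0,1]} f\,d\mu_{\mathcal{L}} = 1/6$.

Figure 9.7. Plot of the sawtooth function for Example 9.4.1.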
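The series in Example 9.4.1 can be checked with exact rational arithmetic. The sketch below assumes that the peaks of the sawtooth alternate in sign, with height $(-1)^{k-1}$ on the interval $(2^{-k}, 2^{-(k-1)}]$; this is what the vertical range of Figure 9.7 and the stated value 1/6 suggest. The function names are illustrative only.

```python
from fractions import Fraction


def piece_integral(k):
    """Integral of the sawtooth over (2^-k, 2^-(k-1)], k = 1, 2, ...

    Assumption: the piece is a triangle with base 2^-k and signed
    height (-1)^(k-1), so its integral is (1/2) * base * height.
    """
    base = Fraction(1, 2 ** k)
    height = (-1) ** (k - 1)
    return Fraction(1, 2) * base * height


def partial_sum(n):
    """Sum of the integrals over the first n subintervals."""
    return sum(piece_integral(k) for k in range(1, n + 1))


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        print(n, partial_sum(n), float(partial_sum(n)))
```

Under this assumption the partial sums form a geometric series with ratio $-1/2$, and the $n$-th partial sum differs from $1/6$ by exactly $\frac{1}{6} 2^{-n}$.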

Example 9.4.2 We compute $\int e^{-[x]}\,d\mu_{\mathcal{L}}$ on $(\mathbb{R}_+, \mathcal{B}_{\mathbb{R}_+}, \mu_{\mathcal{L}})$. Using Theorem 9.4.1, we write
$$\int e^{-[x]}\,d\mu_{\mathcal{L}} = \sum_{i=1}^{\infty} \int_{(i-1,i]} e^{-[x]}\,d\mu_{\mathcal{L}} = \sum_{i=1}^{\infty} e^{-(i-1)}\cdot 1 = \frac{e}{e-1}.$$

It is useful to combine this result with the following theorem, which is an immediate consequence of Theorem 9.2.3.

Theorem 9.4.2 If $f$ is a nonnegative extended real valued measurable function and the measurable set $A$ has $\mu(A) = 0$, then $\int_A f\,d\mu = 0$.

Example 9.4.3 On $(\mathbb{R}, \mathcal{B}, \mu_{\mathcal{L}})$, $\int_{\mathbb{Q}} d\mu_{\mathcal{L}} = 0$.

Indeed, Theorems 9.4.1 and 9.4.2 say that for a fixed nonnegative measurable function $f$, integration considered as a set function of the domain of integration is a measure!

Theorem 9.4.3: Measure Induced by a Nonnegative Measurable Function

If $f$ is a nonnegative extended real valued measurable function, then
$$\mu_f(A) = \int_A f\,d\mu, \quad A \in \mathcal{M},$$
defines a measure on $(\mathbb{X}, \mathcal{M})$. If $\int_{\mathbb{X}} f\,d\mu = 1$, then $\mu_f = P_f$ is a probability measure.

Definition 9.4.1: Density function

$f$ is called the density function for the measure $\mu_f$. In the case that $\mu_f = P_f$ in Theorem 9.4.3 is a probability measure, $f$ is called a probability density for $P_f$.

This means that we can use integration of nonnegative measurable functions to define many new measures on the original measure structure. Note that this approach to defining a measure requires a measure space, not just a measurable space.

Example 9.4.4 On $((0,1], \mathcal{B}_{(0,1]}, \mu_{\mathcal{L}})$, define $\mu(A) = \int_A x\,d\mu_{\mathcal{L}}$. Then, $\mu((a, b]) = \frac{1}{2}(b^2 - a^2)$. Since $\mu((0,1]) = 1/2$, we obtain a probability measure by using the measure $2\mu$, with density $2x$, instead of $\mu$.
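The two computations above are easy to check numerically. The following is a minimal sketch, not part of the original text: it assumes that $[x]$ denotes the integer part of $x$, so that the integrand in Example 9.4.2 equals $e^{-(i-1)}$ on $(i-1, i)$, and it uses the rule of thumb from Remark 9.3.1 to replace the Lebesgue integral in Example 9.4.4 by a midpoint Riemann sum. The function names are hypothetical.

```python
import math


def mu_density_x(a, b, n=100_000):
    """Midpoint-rule approximation of mu((a, b]) = integral of x over (a, b]."""
    h = (b - a) / n
    return sum(a + (k + 0.5) * h for k in range(n)) * h


def floor_exp_partial_sum(n_terms):
    """Partial sum of sum_{i>=1} exp(-(i-1)) * 1 from Example 9.4.2.

    Assumption: [x] is the integer part, so each interval (i-1, i]
    of length 1 contributes exp(-(i-1)).
    """
    return sum(math.exp(-(i - 1)) for i in range(1, n_terms + 1))


if __name__ == "__main__":
    a, b = 0.25, 0.75
    print(mu_density_x(a, b), 0.5 * (b ** 2 - a ** 2))
    print(2 * mu_density_x(0.0, 1.0))  # total mass of the measure 2*mu
    print(floor_exp_partial_sum(60), math.e / (math.e - 1))
```

The midpoint rule is exact for a linear integrand up to rounding, so the first two printed values should agree to near machine precision.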

Example 9.4.5: Cauchy density

The Cauchy probability measure is defined on the Lebesgue measurable sets on $\mathbb{R}$ by
$$P(A) = \int_A f\,d\mu_{\mathcal{L}},$$
where
$$f(x) = f(x; x_0, \gamma) = \frac{1}{\pi\gamma\left[1 + \left(\frac{x - x_0}{\gamma}\right)^2\right]},$$
with constants $x_0 \in \mathbb{R}$ and $\gamma > 0$. It is finite for all $A$ and $P(\mathbb{R}) = 1$. $x_0$ is the location parameter and $\gamma$ is the scale parameter. We plot an example in Figure 9.8.

Example 9.4.6: Normal density

The Gaussian probability measure or normal probability measure is defined on the Lebesgue measurable sets of $\mathbb{R}$ by
$$P(A) = \int_A f\,d\mu_{\mathcal{L}},$$
where
$$f(x) = f(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/2\sigma^2},$$
with real constant $\mu$ and positive constant $\sigma$. It is finite for all $A$ and $P(\mathbb{R}) = 1$. $\mu$ is called the mean and $\sigma^2$ is called the variance. We plot an example in Figure 9.8.

In Example 7.2.4 and Example 7.2.5, we define probability measures by giving their respective cdf. Here, we define measures with the same names by specifying density functions. We have not shown that these two approaches define the same probability measures.

In general, we have established a 1-1 correspondence between distribution functions and Lebesgue-Stieltjes measures. A natural question is whether or not every Lebesgue-Stieltjes measure can be written in terms of an integral with some nonnegative measurable function. It turns out that not all of them can. We discuss this further in Section 12.3.1.

Example 9.4.7: Multivariate normal density

On $(\mathbb{R}^n, \mathcal{B}_{\mathbb{R}^n}, \mu_{\mathcal{L}})$, let $\Sigma$ be a positive definite $n \times n$ matrix ($v^\top \Sigma v > 0$ for all nonzero vectors $v$) and $\mu$ a vector in $\mathbb{R}^n$. The multivariate normal distribution or multivariate Gaussian distribution is the probability measure associated with density
$$f(x) = \frac{1}{\sqrt{(2\pi)^n \det(\Sigma)}} \exp\left(-\tfrac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu)\right), \quad (9.8)$$
where $x = (x_1, \dots, x_n)^\top$. The measure is computed via the multidimensional integral

$$P(A) = \int_A f(x)\,d\mu_{\mathcal{L}}(x).$$
We plot an example in Figure 9.8.

Figure 9.8. Plots of probability densities. Right: Cauchy with $x_0 = 0$ and $\gamma = 1$. Middle: Normal with $\mu = 0$ and $\sigma^2 = 1$. Left: 2-dimensional normal with $\Sigma = I$ and $\mu = (0, 0)^\top$.

We discuss the computation of Lebesgue-Stieltjes measures in several dimensions using distribution functions in Section 6.4. It is much easier to compute measures in several dimensions using density functions.

Theorem 9.4.3 implies that integration inherits properties of a measure. An example is the following continuity result, which is implied by Theorem 5.2.3.

Theorem 9.4.4 Assume that $f$ is a nonnegative extended real valued measurable function.

1. If $\{A_i\} \subset \mathcal{M}$ satisfies $A_1 \subset A_2 \subset \cdots$ and $A = \bigcup_i A_i$, then $\int_A f\,d\mu = \lim_{i \to \infty} \int_{A_i} f\,d\mu$.
2. If $\{A_i\} \subset \mathcal{M}$ satisfies $A_1 \supset A_2 \supset \cdots$, $\mu_f(A_1) = \int_{A_1} f\,d\mu < \infty$, and $A = \bigcap_i A_i$, then $\int_A f\,d\mu = \lim_{i \to \infty} \int_{A_i} f\,d\mu$.

A last result follows from the definitions, but it is useful enough to be stated as a theorem.

Theorem 9.4.5 Let $f$ and $g$ be measurable functions with $g$ nonnegative. Then for all measurable sets $B$,
$$\int_B f\,d\mu_g = \int_B f g\,d\mu.$$

The next result is concerned with a sequence of nonnegative functions, and it is one of the most fundamental properties of the Lebesgue integral with respect to limits. In fact, it is so important that it has its own abbreviation. It applies to sequences that are monotone, and the monotonicity is essential.

Example 9.4.8 Consider the nonnegative simple functions on $(\mathbb{R}, \mathcal{B}, \mu_{\mathcal{L}})$,
$$s_i(x) = \begin{cases} i, & 0 < x < 1/i, \\ 0, & \text{otherwise}, \end{cases} \qquad w_i(x) = \begin{cases} i^2, & 0 < x < 1/i, \\ 0, & \text{otherwise}. \end{cases}$$
Then, $s_i \to 0$ and $w_i \to 0$ pointwise, but $\int s_i\,d\mu_{\mathcal{L}} = 1$ and $\int w_i\,d\mu_{\mathcal{L}} = i$ for all $i$. These sequences are not monotone of course.

Theorem 9.4.6: Monotone Convergence Theorem (MCT)

If $\{f_i\}$ is a sequence of nonnegative extended real valued measurable functions with $f_i \le f_{i+1}$ a.e. for all $i$ and $f = \lim_{i\to\infty} f_i$ $(= \sup_i f_i)$ a.e., then
$$\int_A f\,d\mu = \lim_{i \to \infty} \int_A f_i\,d\mu$$
for any measurable set $A$.

Let $\{X_i\}$ be a monotone increasing sequence of nonnegative extended real valued random variables and set $X = \lim_{i\to\infty} X_i$. Then, $E(X_i) \uparrow E(X)$.

There is a general mathematical issue relevant here. Namely, in a situation in which it is reasonable to take limits in more than one variable, it is important to investigate whether the order in which the limits are taken matters. Since integration is obtained as a limit, mixing integration with any other limiting process raises this question. This theorem gives general conditions under which a limit and integration can be exchanged. Recall that this exchange does not hold for Riemann integration in general.

Proof. We assume the conditions hold pointwise and leave the extension to the a.e. case as an exercise. We also note that if the result holds for integrals over $\mathbb{X}$ and if $A$ is measurable, then $\{f_i \chi_A\}$ is a monotone sequence of nonnegative measurable functions that converges to $f \chi_A$, so the theorem can be applied to the integration over $A$.

The proof is a variation of the argument used to prove Theorem 9.2.2. $\{\int f_i\,d\mu\}$ is a monotone increasing sequence of numbers, so it has a limit, which may be $\infty$. Moreover, the monotonicity of $\{f_i\}$ guarantees that $f$ is defined and nonnegative, while Theorem 8.2.11 guarantees $f$ is measurable, so that $\int f\,d\mu$ is defined. Finally, $f_i \le f$ for all $i$, so $\int f_i\,d\mu \le \int f\,d\mu$ for all $i$, so $\lim_{i\to\infty} \int f_i\,d\mu \le \int f\,d\mu$.

We prove the reverse inequality. Fix $0 < \alpha < 1$, and let $s$ be a simple function with $0 \le s \le f$. Define $A_i = \{x : f_i(x) \ge \alpha s(x)\}$. $\{A_i\}$ is a monotone increasing sequence of measurable sets ($A_i \subset A_{i+1}$) such that $\bigcup_i A_i = \mathbb{X}$. We have
$$\int f_i\,d\mu \ge \int_{A_i} f_i\,d\mu \ge \alpha \int_{A_i} s\,d\mu, \quad \text{for all } i.$$

By Theorem 9.4.3 and the continuity of measures, $\lim_{i\to\infty} \int_{A_i} s\,d\mu = \int s\,d\mu$, so
$$\lim_{i\to\infty} \int f_i\,d\mu \ge \alpha \int s\,d\mu.$$
Since $\alpha$ is arbitrary, $\lim_{i\to\infty} \int f_i\,d\mu \ge \int s\,d\mu$. Taking the sup over all simple functions $s \le f$ gives the result.

We shall see that the MCT is the key point in the proofs of a number of other important results. We prove some relatively elementary results to give the idea. The first example is very reminiscent of the definition of the improper Riemann integral on $[0, \infty)$.

Theorem 9.4.7 Let $f$ be a nonnegative extended real valued measurable function. Then,
$$\int_{[0,\infty)} f\,d\mu = \lim_{i \to \infty} \int_{[0,i)} f\,d\mu.$$

Proof. Set $f_i = f\,I_{[0,i)}$ for $i = 1, 2, \dots$. Then, $\{f_i\}$ is a monotone increasing sequence of nonnegative measurable functions with limit $f$ on $[0,\infty)$, and Theorem 9.4.6 implies
$$\int_{[0,\infty)} f\,d\mu = \lim_{i\to\infty} \int_{[0,\infty)} f_i\,d\mu = \lim_{i\to\infty} \int_{[0,i)} f\,d\mu.$$

Another simple but useful result:

Theorem 9.4.8 If $\{f_i\}$ is a sequence of nonnegative extended real valued measurable functions and $f = \sum_{i=1}^{\infty} f_i$, then $f$ is a nonnegative extended real valued measurable function and for any measurable set $A$,
$$\int_A f\,d\mu = \sum_{i=1}^{\infty} \int_A f_i\,d\mu.$$
If $\{X_i\}$ is a sequence of nonnegative extended real valued random variables, then
$$E\left(\sum_{i=1}^{\infty} X_i\right) = \sum_{i=1}^{\infty} E(X_i).$$

Proof. The sequence of partial sums $\{F_j\}_{j=1}^{\infty}$, with $F_j = \sum_{i=1}^{j} f_i$, defines an increasing sequence of nonnegative extended real valued measurable functions, which has limit $f$. The MCT implies the result.

Another example is the following change of variables formula.

Theorem 9.4.9: Change of Variables

Let $(\mathbb{X}, \mathcal{M}, \mu)$ be a measure space and $(\mathbb{Y}, \mathcal{N})$ be a measurable space and assume that $g : \mathbb{X} \to \mathbb{Y}$ is an $(\mathcal{M}, \mathcal{N})$-measurable function. If $f : \mathbb{Y} \to \overline{\mathbb{R}}$ is a nonnegative $\mathcal{N}$-measurable function, then
$$\int_{g^{-1}(B)} f(g(x))\,d\mu(x) = \int_B f(y)\,d\mu_g(y), \quad (9.9)$$
for all $B \in \mathcal{N}$, where if one of the integrals in (9.9) exists then so does the other. A general measurable function $f$ is integrable with respect to $\mu_g$ if and only if $f \circ g$ is integrable with respect to $\mu$, in which case (9.9) holds.

Proof. The existence claim follows for nonnegative measurable functions because there is no change in sign in (9.9).

Suppose that $f = \chi_B$ for some $B \in \mathcal{N}$. Then,
$$(f \circ g)(x) = \begin{cases} 0, & g(x) \notin B, \\ 1, & g(x) \in B, \end{cases} \;=\; \chi_{g^{-1}(B)}(x).$$
So,
$$\int f \circ g\,d\mu = \mu(g^{-1}(B)) \quad\text{and}\quad \int f\,d\mu_g = \mu_g(B) = \mu(g^{-1}(B)).$$
This shows the result for a characteristic function of a measurable set. By linearity, this implies the result holds for nonnegative simple functions.

In the case of a nonnegative measurable function $f$, we construct a monotone sequence of simple functions $\{s_i\}$ such that $s_i \uparrow f$ pointwise. We have
$$\int s_i \circ g\,d\mu = \int s_i\,d\mu_g, \quad \text{for all } i.$$
We pass to the limit using the MCT to get the desired result.

For a general measurable function $f$, we apply the first result to the nonnegative measurable function $|f|$ to conclude $f$ is integrable with respect to $\mu_g$ if and only if $f \circ g$ is integrable with respect to $\mu$. Then we apply the first result to $f^+$ and $f^-$ separately and subtract the results to obtain (9.9).

Note on the proof: This proof follows a useful pattern: first establish the result for nonnegative simple functions, then pass to nonnegative measurable functions using approximation by sequences of nonnegative simple functions and the MCT. Then, extend to general functions using the decomposition into positive and negative parts.
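On a finite measure space, both sides of (9.9) are finite sums, so the identity can be checked exactly. The sketch below is an illustration under assumed data: the point masses in `mu`, the map `g`, and the function `f` are all hypothetical choices, and `pushforward` computes $\mu_g(B) = \mu(g^{-1}(B))$ point by point.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical finite measure space: points of X with their masses.
mu = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 8), 3: Fraction(1, 8)}


def g(x):
    """Hypothetical measurable map from X = {0, 1, 2, 3} to Y = {0, 1}."""
    return x % 2


def f(y):
    """Hypothetical nonnegative function on Y."""
    return 10 * y + 1


def pushforward(measure, mapping):
    """Induced measure mu_g on Y: mu_g({y}) = mu(g^{-1}({y}))."""
    induced = defaultdict(Fraction)
    for x, mass in measure.items():
        induced[mapping(x)] += mass
    return dict(induced)


def integrate(h, measure, domain=None):
    """Integral of h over `domain` (the whole space if None) for point masses."""
    return sum(h(p) * m for p, m in measure.items()
               if domain is None or p in domain)


mu_g = pushforward(mu, g)
B = {1}
preimage_B = {x for x in mu if g(x) in B}

lhs = integrate(lambda x: f(g(x)), mu, preimage_B)  # left side of (9.9)
rhs = integrate(f, mu_g, B)                         # right side of (9.9)

if __name__ == "__main__":
    print(mu_g, lhs, rhs)
```

This mirrors the first step of the proof: for point masses every function is simple, so the identity reduces to $\mu_g(B) = \mu(g^{-1}(B))$ plus linearity.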

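Example 9.4.9 We use Theorem 9.4.9 to compute
$$\int_{(0,1]} \sin(\pi/x)\,d\mu_{\mathcal{L}} \quad (9.10)$$
on $((0,1], \mathcal{B}_{(0,1]}, \mu_{\mathcal{L}})$. We plot the integrand in Figure 9.9. We set $g(x) = 1/x$, which is measurable on $(0,1]$, and which maps $((0,1], \mathcal{B}_{(0,1]})$ to $([1,\infty), \mathcal{B}_{[1,\infty)})$. Correspondingly, we set $f(y) = \sin(\pi y)$, which is measurable on $[1, \infty)$. Then, $(0,1] = g^{-1}([1,\infty))$ and
$$\int_{(0,1]} \sin(\pi/x)\,d\mu_{\mathcal{L}} = \int_{[1,\infty)} \sin(\pi y)\,d\mu_g(y),$$
where $\mu_g(B) = \mu_{\mathcal{L}}(g^{-1}(B))$ for $B \in \mathcal{B}_{[1,\infty)}$ and $g^{-1}(B) \in \mathcal{B}_{(0,1]}$.

Dealing with $\mu_g$ is annoying at best. We write this as a density with respect to $\mu_{\mathcal{L}}(y)$. Since $g^{-1}([a,b)) = \left(\frac{1}{b}, \frac{1}{a}\right]$,
$$\mu_g([a,b)) = \frac{1}{a} - \frac{1}{b} = \frac{b-a}{ab}, \quad \text{for } 1 \le a \le b.$$
Thus, we seek a density function $h$ such that $\int_{[a,b)} h(y)\,d\mu_{\mathcal{L}}(y) = \frac{b-a}{ab}$. This is $h(y) = 1/y^2$, since
$$\int_{[a,b)} \frac{1}{y^2}\,d\mu_{\mathcal{L}}(y) = \frac{1}{a} - \frac{1}{b}.$$
Note, we are using the soon-to-be-proved fact that we can compute the integral as a Riemann integral. Thus,
$$\int_{(0,1]} \sin(\pi/x)\,d\mu_{\mathcal{L}} = \int_{[1,\infty)} \frac{\sin(\pi y)}{y^2}\,d\mu_{\mathcal{L}}(y). \quad (9.11)$$
The integrand is not positive, hence we have to check that the integral on the right-hand side is defined. However,
$$\int_{[1,\infty)} \left(\frac{\sin(\pi y)}{y^2}\right)^{+} d\mu_{\mathcal{L}}(y) = \int_{[2,3)} \frac{\sin(\pi y)}{y^2}\,d\mu_{\mathcal{L}}(y) + \int_{[4,5)} \frac{\sin(\pi y)}{y^2}\,d\mu_{\mathcal{L}}(y) + \int_{[6,7)} \frac{\sin(\pi y)}{y^2}\,d\mu_{\mathcal{L}}(y) + \cdots \le \frac{1}{2^2} + \frac{1}{4^2} + \frac{1}{6^2} + \cdots < \infty.$$
Likewise, $\left(\frac{\sin(\pi y)}{y^2}\right)^{-}$ also has a finite integral, so the integral on the right-hand side of (9.9) is defined and finite. (The reader is invited to try to prove (9.10) is defined directly!)

The integral is defined, but not easy to compute. We approximate its value using a sequence of Monte Carlo computations using $2^6$ to $2^{23}$ points. These seem to be converging to approximately 0.0115, see Figure 9.9.

Example 9.4.8 illustrates why monotonicity is essential for the MCT. On the other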

hand, we can prove a weaker result that says the integral of a limit can never be larger than the limit of the integrals. This is another result that should be called a theorem because of its central importance.

Figure 9.9. Left: Plot of $\sin(\pi/x)$ against $x$. Right: Results of Monte Carlo approximations of (9.10) over $(0,1]$ computed using (9.9), plotted as approximate expectation against log(number of samples).

Theorem 9.4.10: Fatou's Lemma

If $\{f_i\}$ is a sequence of nonnegative extended real valued measurable functions, then
$$\int \left(\liminf_{i\to\infty} f_i\right) d\mu \le \liminf_{i\to\infty} \int f_i\,d\mu. \quad (9.12)$$
If $\{X_i\}$ is a sequence of nonnegative extended real valued random variables, then
$$E\left(\liminf_{i\to\infty} X_i\right) \le \liminf_{i\to\infty} E(X_i).$$

Proof. For each $k$, $\inf_{i \ge k} f_i \le f_j$ for $j \ge k$. Hence,
$$\int \inf_{i \ge k} f_i\,d\mu \le \int f_j\,d\mu, \quad j \ge k.$$
We use the MCT to apply the limit $k \to \infty$,
$$\int \left(\liminf_{i\to\infty} f_i\right) d\mu = \lim_{k \to \infty} \int \left(\inf_{i \ge k} f_i\right) d\mu \le \liminf_{i\to\infty} \int f_i\,d\mu.$$

We next discuss a fundamental result for sequences of general measurable functions. Using an old idea in analysis, we replace the assumption of monotonicity in the MCT by a dominating function.

Theorem 9.4.11: Dominated Convergence Theorem (DCT)

Let $\{f_i\}$ be a sequence of extended real valued measurable functions and assume there is a nonnegative measurable function $g$ such that $|f_i| \le g$ a.e. for each $i$ and $\int g\,d\mu < \infty$.

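Then,
$$-\infty < \int \liminf_{i\to\infty} f_i\,d\mu \le \liminf_{i\to\infty} \int f_i\,d\mu \le \limsup_{i\to\infty} \int f_i\,d\mu \le \int \limsup_{i\to\infty} f_i\,d\mu < \infty. \quad (9.13)$$
If in addition, $f = \lim_{i\to\infty} f_i$ exists a.e., then
$$\int |f|\,d\mu < \infty, \qquad \lim_{i\to\infty} \int f_i\,d\mu = \int f\,d\mu, \qquad \lim_{i\to\infty} \int |f_i - f|\,d\mu = 0. \quad (9.14)$$

Let $\{X_i\}$ be a sequence of extended real valued r.v. such that $X_i \to X$ a.s. and suppose $Y$ is a nonnegative r.v. with $|X_i| \le Y$ for all $\omega \in \Omega$ and all $i$, and $E(Y) < \infty$. Then, $\lim_{i\to\infty} E(X_i) = E(X)$.

Proof. Since $|f_i| \le g$ a.e., $|\liminf f_i| \le g$ a.e. and $|\limsup f_i| \le g$ a.e. Hence, all of the $f_i$, $\liminf f_i$, and $\limsup f_i$ are integrable. We apply Fatou's Lemma 9.4.10 to the sequences of nonnegative measurable functions $\{g + f_i\}$ and $\{g - f_i\}$ to obtain (9.13). For example,
$$\int g\,d\mu + \int \liminf_{i\to\infty} f_i\,d\mu = \int \liminf_{i\to\infty} (g + f_i)\,d\mu \le \liminf_{i\to\infty} \int (g + f_i)\,d\mu = \int g\,d\mu + \liminf_{i\to\infty} \int f_i\,d\mu.$$
Since $\int g\,d\mu < \infty$, we can subtract it from both sides. The sequence $\{g - f_i\}$ gives the inequality for the $\limsup$ in the same way.

Under the additional assumption that $f$ exists, $|f| \le g$ a.e. as well, so $f$ is integrable, proving the first result in (9.14). Then, $\liminf f_i = \limsup f_i = f$ a.e., so $\int \liminf f_i\,d\mu = \int \limsup f_i\,d\mu$, and the second result in (9.14) follows from (9.13). Finally, $|f_i - f| \le 2g$ a.e. and $|f_i - f| \to 0$ a.e., so the third result in (9.14) follows by applying the second result to $\{|f_i - f|\}$.

Note on the proof: We use a standard trick of introducing sequences of parameter values to evaluate the definitions of continuity and the derivative in order to use the DCT.

Definition 9.4.2 If $\{f_i\}$ is a sequence of measurable functions for which there is a measurable function $f$ such that $\lim_{i\to\infty} \int |f_i - f|\,d\mu = 0$, we say that $\{f_i\}$ converges to $f$ in $L^1$.

We discuss the significance in Chapter ??.

We present a simplified version that is useful in practice.

Theorem 9.4.12: Bounded Convergence Theorem (BCT)

Let $\{f_i\}$ be a sequence of extended real valued measurable functions. Assume that $f = \lim_{i\to\infty} f_i$ exists a.e., there is a finite constant $M$ such that $|f_i(x)| \le M$ for all $i$ and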

$x \in A$, and $\mu(A) < \infty$. Then,
$$\lim_{i\to\infty} \int_A f_i\,d\mu = \int_A f\,d\mu.$$
Let $\{X_i\}$ be a sequence of extended real valued r.v. Assume $X = \lim_{i\to\infty} X_i$ exists a.s. Suppose there is a finite constant $M$ such that $|X_i| \le M$ for all $i$. Then,
$$E(|X|) \le M, \qquad \lim_{i\to\infty} E(X_i) = E(X), \qquad \lim_{i\to\infty} E(|X_i - X|) = 0.$$

The proof is a good exercise for application of the DCT.

We use the DCT to prove a theorem with great practical importance. We consider measurable functions that depend on a set of parameters, e.g. the Cauchy density. The theorem states that if a function is continuous or differentiable with respect to the parameter, then the integral of the function is continuous or differentiable with respect to the parameter respectively. In other words, integration does not make the integrand less smooth. We state the result for a single parameter to keep notation simple. The generalization to a finite set of parameters is straightforward.

Theorem 9.4.13: Integration of Functions with Parameters

Suppose that $\{f(\cdot\,; t)\}$ is a family of real-valued integrable measurable functions for each $a \le t \le b$, where $a$ and $b$ are finite.

1. Assume there is a nonnegative function $g$ with $\int g\,d\mu < \infty$ such that $|f(x; t)| \le g(x)$ for all $x$ and $a \le t \le b$. If $\lim_{t \to t_0} f(x; t) = f(x; t_0)$ for every $x$, then
$$\lim_{t \to t_0} \int f(x; t)\,d\mu(x) = \int f(x; t_0)\,d\mu(x).$$
In other words, if $f(\cdot\,; t)$ is continuous at $t_0$, then so is $\int f(x; t)\,d\mu(x)$.
2. Assume that $\partial f(x; t)/\partial t$ exists for each $x$ and there is a nonnegative function $g$ with $\int g\,d\mu < \infty$ such that $|\partial f(x; t)/\partial t| \le g(x)$ for all $x$ and $a \le t \le b$. Then, $\int f(x; t)\,d\mu(x)$ is differentiable with respect to $t$ and
$$\frac{d}{dt} \int f(x; t)\,d\mu(x) = \int \frac{\partial f}{\partial t}(x; t)\,d\mu(x), \quad a \le t \le b.$$

Let $\{X(\cdot\,; t)\}$ be a family of r.v. for $a \le t \le b$, where $a$ and $b$ are finite.

1. Assume there is a nonnegative r.v. $Y$ with $E(Y) < \infty$ such that $|X(x; t)| \le Y(x)$ for all $x \in \Omega$ and $a \le t \le b$. If $\lim_{t \to t_0} X(x; t) = X(x; t_0)$ for every $x \in \Omega$, then $\lim_{t \to t_0} E(X(\cdot\,; t)) = E(X(\cdot\,; t_0))$.
2. Assume $\partial X(x; t)/\partial t$ exists for each $x \in \Omega$ and there is a nonnegative r.v. $Y$ with $E(Y) < \infty$ such that $|\partial X(x; t)/\partial t| \le Y(x)$ for all $x \in \Omega$ and $a \le t \le b$. Then $E(X(\cdot\,; t))$ is differentiable with respect to $t$ and
$$\frac{d}{dt} E(X(\cdot\,; t)) = E\left(\frac{\partial X}{\partial t}(\cdot\,; t)\right), \quad a \le t \le b.$$

Proof.

Result 1. We use the DCT, so we need to create a sequence of functions. Choose a sequence $\{t_i\} \subset [a, b]$ with $t_i \to t_0$ and define $f_i(x) = f(x; t_i)$. Now apply the DCT to $\{f_i\}$.

Result 2. Define $F(t) = \int f(x; t)\,d\mu(x)$, choose a sequence $\{t_i\} \subset [a, b]$ with $t_i \to t_0$ and $t_i \ne t_0$, and define
$$g_i(x) = \frac{f(x; t_i) - f(x; t_0)}{t_i - t_0}.$$
Then $\frac{\partial f}{\partial t}(x; t_0) = \lim_{i\to\infty} g_i(x)$, so it is a measurable function. By the Mean Value Theorem,
$$|g_i(x)| \le \sup_{a \le t \le b} \left|\frac{\partial f}{\partial t}(x; t)\right| \le g(x).$$
We apply the DCT to conclude
$$F'(t_0) = \lim_{i\to\infty} \frac{F(t_i) - F(t_0)}{t_i - t_0} = \lim_{i\to\infty} \int g_i(x)\,d\mu(x) = \int \frac{\partial f}{\partial t}(x; t_0)\,d\mu(x).$$

Example 9.4.10 The Cauchy and normal measures induced by the Cauchy and normal probability density functions are differentiable with respect to their parameters.

9.5 Two important inequalities

In this section, we prove two fundamental inequalities for analysis. The first inequality is an analog of the Cauchy-Schwarz inequality for vectors in $\mathbb{R}^n$, i.e. if $x, y \in \mathbb{R}^n$, then $|x \cdot y| \le \|x\|\,\|y\|$, where $\cdot$ is the usual dot product and $\|\cdot\|$ is the usual Euclidean norm. It is impossible to overstate the importance of this theorem.

Theorem 9.5.1: Cauchy-Schwarz Inequality (C-S)

Assume $f$ and $g$ are extended real valued measurable functions such that $\int f^2\,d\mu < \infty$ and $\int g^2\,d\mu < \infty$. Then, $f g$ is measurable and integrable and
$$\left(\int f g\,d\mu\right)^2 \le \int f^2\,d\mu \int g^2\,d\mu. \quad (9.15)$$
Moreover, equality holds if and only if one of the functions is a constant multiple of the other a.e.

Assume $X$ and $Y$ are r.v. such that $E(X^2) < \infty$ and $E(Y^2) < \infty$. Then $XY$ is a r.v. and
$$E(XY)^2 \le E(X^2)\,E(Y^2).$$

Proof. Let $E_1 = \{x \in \mathbb{X} : |f(x)| > |g(x)|\}$ and $E_2 = \{x \in \mathbb{X} : |g(x)| \ge |f(x)|\}$. Note that $\mathbb{X} = E_1 \cup E_2$, $E_1 \cap E_2 = \emptyset$, and $\mu(\mathbb{X}) = \mu(E_1) + \mu(E_2)$. On $E_1$, $|f g| \le f^2$, while on $E_2$, $|f g| \le g^2$. So,
$$\int_{\mathbb{X}} |f g|\,d\mu \le \int_{E_1} f^2\,d\mu + \int_{E_2} g^2\,d\mu \le \int_{\mathbb{X}} f^2\,d\mu + \int_{\mathbb{X}} g^2\,d\mu < \infty.$$

This shows the first conclusion. If $f = c g$ a.e., then the result follows. Suppose neither of $f, g$ is a constant multiple of the other. For $\alpha \in \mathbb{R}$, $\int_{\mathbb{X}} (\alpha f - g)^2\,d\mu$ is defined and is greater than $0$. Now,
$$\int_{\mathbb{X}} (\alpha f - g)^2\,d\mu = \alpha^2 \int_{\mathbb{X}} f^2\,d\mu - 2\alpha \int_{\mathbb{X}} f g\,d\mu + \int_{\mathbb{X}} g^2\,d\mu,$$
which is a quadratic function in $\alpha$. The left side is positive for all $\alpha$, so the discriminant satisfies
$$4\left(\int_{\mathbb{X}} f g\,d\mu\right)^2 - 4 \int_{\mathbb{X}} f^2\,d\mu \int_{\mathbb{X}} g^2\,d\mu < 0.$$

Recall that the dot product provides an inner product structure on the vector space $\mathbb{R}^n$, which leads to the very useful concept of orthogonality. This is a finite dimensional example of the more general concept of a Hilbert space. We can define a similar inner product structure on the vector space of measurable functions whose squares are Lebesgue integrable. We discuss this in Chapter ??.

The next inequality is simple to prove, but very useful for certain kinds of arguments.

Theorem 9.5.2: Markov's inequality

Let $f$ be a nonnegative extended real valued measurable function. For $c > 0$ and $A \in \mathcal{M}$, set $A_c = \{x \in A : f(x) \ge c\}$. Then,
$$\mu(A_c) \le \frac{1}{c} \int_{A_c} f\,d\mu = \frac{1}{c}\,\mu_f(A_c). \quad (9.16)$$

Proof. Since $f \ge c$ on $A_c$, $\int_{A_c} c\,d\mu \le \int_{A_c} f\,d\mu$. But $\int_{A_c} c\,d\mu = c\,\mu(A_c)$.

9.6 Riemann and Lebesgue integration

In this section, we explore the relation between the Lebesgue and Riemann integrals. Of course, this is an interesting subject on its own. But a practical benefit is that we can compute Lebesgue integrals using standard methods for computing Riemann integrals in many situations. As we have seen, computing Lebesgue integrals using the definition can be difficult.

We develop the connection between the two integrals in $\mathbb{R}$. There is a straightforward extension to $\mathbb{R}^n$.

We let $[a, b]$ be fixed and let $F$ be a fixed distribution function on $[a, b]$ associated with complete L-S measure $\mu$. So, the measure space is $([a, b], \mathcal{M}_{[a,b]}, \mu)$. We let $f$ be a bounded measurable function, $|f| \le M$, on $[a, b]$.

We first recall the Darboux formulation of the Riemann integral (see Appendix ??).