Chapter 7. Product measure and Fubini's theorem

This chapter is based on [Billingsley, Section 18].

1. Product spaces

Suppose (Ω₁, F₁) and (Ω₂, F₂) are two probability spaces. In the product space Ω = Ω₁ × Ω₂, a measurable rectangle is a set A₁ × A₂ where A₁ ∈ F₁ and A₂ ∈ F₂. We define F₁ × F₂ as the σ-field generated by the measurable rectangles.

Example 7.1. B(R) × B(R) = B(R²).

Theorem 7.1. (i) If E ∈ F₁ × F₂, then for each ω₁ ∈ Ω₁ the section

E₂(ω₁) = {ω₂ ∈ Ω₂ : (ω₁, ω₂) ∈ E}

is in F₂. Similarly, the section E₁(ω₂) ∈ F₁.

(ii) If f : Ω → R is measurable, then for each ω₁ ∈ Ω₁ the function ω₂ ↦ f(ω₁, ω₂) is measurable.

Proof. Fix ω₁ ∈ Ω₁. Consider the mapping T : Ω₂ → Ω given by T(ω₂) = (ω₁, ω₂). If E = A₁ × A₂ is a measurable rectangle, then T⁻¹(E) is either ∅ or A₂. So T is measurable, and T⁻¹(E) = E₂(ω₁) ∈ F₂ for all E ∈ F₁ × F₂, which proves (i). Now g(ω₂) = f ∘ T(ω₂) = f(ω₁, ω₂) is measurable as a composition of measurable functions. □

2. Product measure

Suppose P₁, P₂ are probability measures on F₁ and F₂ respectively. By Theorem 7.1, the function ω₁ ↦ P₂(E₂(ω₁)) is well defined. The collection L of subsets E of Ω₁ × Ω₂ for which this function is F₁-measurable is a λ-system. The collection of measurable rectangles E = A₁ × A₂ is a π-system, and for such E the function is
P₂(E₂(ω₁)) = I_{A₁}(ω₁) P₂(A₂), so it is measurable. Therefore ω₁ ↦ P₂(E₂(ω₁)) is measurable for all E ∈ F₁ × F₂. Since this is a non-negative and bounded function, we can define

(7.1) P′(E) := ∫_{Ω₁} P₂(E₂(ω₁)) P₁(dω₁)

and

(7.2) P″(E) := ∫_{Ω₂} P₁(E₁(ω₂)) P₂(dω₂).

It is clear that both P′ and P″ are probability measures. (Continuity follows from the monotone convergence theorem for the integrals.)

We note that for measurable rectangles P′(A₁ × A₂) = P″(A₁ × A₂) = P₁(A₁)P₂(A₂). Since the class of sets E for which P′(E) = P″(E) is a λ-system, this means that P′ = P″. The common value is the product measure P = P₁ × P₂.

Note that the above construction inductively gives a product measure on Ω₁ × ⋯ × Ωₙ. In particular, it implies a finite version of Theorem 4.7.

Theorem 7.2. If F₁, F₂, ..., Fₙ are cumulative distribution functions, then there exist a probability space (Ω, F, P) and a sequence X₁, X₂, ..., Xₙ of independent random variables such that Xₖ has cumulative distribution function Fₖ.

3. Fubini's Theorem

Theorem 7.3. If f : Ω = Ω₁ × Ω₂ → R is non-negative, then the functions ω₂ ↦ ∫_{Ω₁} f(ω₁, ω₂) P₁(dω₁) and ω₁ ↦ ∫_{Ω₂} f(ω₁, ω₂) P₂(dω₂) are measurable and

(7.3) ∫_Ω f(ω₁, ω₂) d(P₁ × P₂) = ∫_{Ω₂} ( ∫_{Ω₁} f(ω₁, ω₂) P₁(dω₁) ) P₂(dω₂) = ∫_{Ω₁} ( ∫_{Ω₂} f(ω₁, ω₂) P₂(dω₂) ) P₁(dω₁).

If f is P₁ × P₂-integrable, then (7.3) holds.

Sketch of the proof. For f = I_E, formula (7.3) is just the definition of the product measure. By linearity, the same formula holds for simple f. Now if fₙ ↑ f with fₙ simple, then the functions ∫_{Ω₁} fₙ(ω₁, ω₂) P₁(dω₁) are non-decreasing, so the monotone convergence theorem (Theorem 6.4) gives (7.3). This ends the first part of the proof (Tonelli's theorem). We omit the proof of the second part¹.

Fubini's theorem is often a powerful computational tool. The following example illustrates its power, and the value of the limit will be needed later on.
¹ After the standard decomposition f = f⁺ − f⁻, one needs to integrate over the set Ω₂′ = {ω₂ : ∫_{Ω₁} |f(ω₁, ω₂)| P₁(dω₁) < ∞}, which may be smaller than Ω₂ but has P₂-measure 1.
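As an illustration (not part of the original text), the agreement of the two iterated integrals (7.1) and (7.2) on non-rectangular sets can be checked numerically. The sketch below takes P₁ = P₂ to be the uniform distribution on (0, 1), approximated by a midpoint rule on a grid, and the triangle E = {(ω₁, ω₂) : ω₁ + ω₂ ≤ 1}, whose product measure is the area 1/2; the grid size n is an arbitrary choice.

```python
# Sanity check: for P1 = P2 = Uniform(0,1) and the non-rectangular set
# E = {(x, y) : x + y <= 1}, the iterated integrals (7.1) and (7.2) agree
# and both equal the area 1/2.
n = 2000
h = 1.0 / n
pts = [(i + 0.5) * h for i in range(n)]  # midpoints of the grid cells

# (7.1): integrate omega_1 -> P2(E_2(omega_1)) = 1 - omega_1 against P1
p_71 = sum(min(1.0, max(0.0, 1.0 - x)) * h for x in pts)
# (7.2): integrate omega_2 -> P1(E_1(omega_2)) = 1 - omega_2 against P2
p_72 = sum(min(1.0, max(0.0, 1.0 - y)) * h for y in pts)

print(p_71, p_72)  # both approximately 0.5
```

By symmetry of E the two sums are identical here; the point is that each one separately approximates the product measure of E.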
Example 7.2. The function f(x, u) = sin x · e^{−ux} is integrable on (0, t) × (0, ∞), as (by Tonelli's theorem)

∫₀^t ( ∫₀^∞ e^{−ux} |sin x| du ) dx = ∫₀^t (|sin x|/x) dx ≤ t < ∞.

By Fubini's theorem,

∫₀^t (sin x)/x dx = ∫₀^t ( ∫₀^∞ e^{−ux} sin x du ) dx = ∫₀^∞ ( ∫₀^t e^{−ux} sin x dx ) du = ∫₀^∞ (1/(1 + u²)) (1 − e^{−ut}(u sin t + cos t)) du.

Thus by the Lebesgue dominated convergence theorem

lim_{t→∞} ∫₀^t (sin x)/x dx = ∫₀^∞ du/(1 + u²) = π/2.

(Omitted in 218: substituting s = ut in the last integral gives ∫₀^t (sin x)/x dx = π/2 − ∫₀^∞ (s sin t + t cos t) e^{−s}/(t² + s²) ds.)

3.1. Integration by parts. If F, G have no common points of discontinuity in (a, b], then

∫_{(a,b]} G(x) F(dx) = F(b)G(b) − F(a)G(a) − ∫_{(a,b]} F(x) G(dx).

Proof. Write (a, b] × (a, b] = Δ₋ ∪ Δ₊, where Δ₋ = {(x, y) : a < y ≤ x ≤ b} and Δ₊ = {(x, y) : a < x ≤ y ≤ b}. The product measure P of the measures determined by F and G satisfies

(F(b) − F(a))(G(b) − G(a)) = P((a, b] × (a, b]) = P(Δ₋) + P(Δ₊) − P(Δ₋ ∩ Δ₊).

We note that Δ₋ ∩ Δ₊ = {(x, x) : a < x ≤ b}. By Fubini's theorem

P(Δ₋ ∩ Δ₊) = ∫_{(a,b]} F({x}) G(dx) = 0,

as F, G have no common point-mass atoms. The formula is now a calculation, using

P(Δ₋) = ∫_{(a,b]} (G(x) − G(a)) F(dx),  P(Δ₊) = ∫_{(a,b]} (F(y) − F(a)) G(dy). □

3.2. Tail integration formula. Formula (5.9) and its various generalizations are easy to derive from Fubini's theorem: if X ≥ 0, then X^p = ∫₀^∞ p t^{p−1} I_{t<X} dt, so

(7.4) E(X^p) = p ∫₀^∞ t^{p−1} P(X > t) dt.
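The computation in Example 7.2 can be checked numerically. The sketch below (not part of the text) approximates both sides of the Fubini identity by midpoint-rule sums and compares them with π/2; the truncation point u_max of the u-integral and the grid sizes are arbitrary choices.

```python
import math

def dirichlet_truncated(t, n=200_000):
    # midpoint rule for int_0^t sin(x)/x dx
    h = t / n
    return sum(math.sin((i + 0.5) * h) / ((i + 0.5) * h) * h for i in range(n))

def fubini_side(t, u_max=200.0, n=200_000):
    # midpoint rule for int_0^{u_max} (1 - e^{-ut}(u sin t + cos t))/(1+u^2) du,
    # a truncation of the u-integral obtained via Fubini's theorem
    h = u_max / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += (1.0 - math.exp(-u * t) * (u * math.sin(t) + math.cos(t))) / (1.0 + u * u) * h
    return total

t = 50.0
lhs = dirichlet_truncated(t)
rhs = fubini_side(t)
print(lhs, rhs, math.pi / 2)  # all three agree to a few hundredths
```

The residual gaps come from truncating the integrals, of order cos(t)/t for the x-integral and π/2 − arctan(u_max) for the u-integral.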
This formula holds true also in the non-integrable case: both sides are then ∞.

3.3. Convolutions. Suppose X₁, X₂ are independent random variables with cumulative distribution functions F₁, F₂. Then the cumulative distribution function F_{X₁+X₂}(x) = P(X₁ + X₂ ≤ x) is given by

(7.5) F_{X₁+X₂}(x) = ∫_R F₁(x − u) F₂(du).

Proof.

P(X₁ + X₂ ≤ x) = ∫_{R²} I_{u+v≤x} (P₁ × P₂)(du, dv) = ∫_R ( ∫_R I_{u+v≤x} P₁(du) ) P₂(dv) = ∫_R F₁(x − v) P₂(dv). □

Required Exercises

Exercise 7.1. Suppose lim_{n→∞} n² P(|X| > n) < ∞. Prove that E|X| < ∞.

Exercise 7.2. Suppose P(|X| > n) ≤ 1/2ⁿ for all n. Prove that there exists δ > 0 such that E e^{δ|X|} < ∞.

Exercise 7.3. Use (7.5) to compute the cumulative distribution function for the sum of two independent uniform U(0, 1) random variables. (Then differentiate to compute the density.)

Exercise 7.4. Use (7.5) to compute the cumulative distribution function for the sum of two independent exponential (λ = 1) random variables. (Then differentiate to compute the density.)

Exercise 7.5. Use Fubini's theorem (not the tail integration formula from Theorem 6.2) to show that if X ≥ 0 then

E[1/(1 + X)] = 1 − ∫₀^∞ P(X > t)/(t + 1)² dt.

Then re-derive the same result from Theorem 6.2.

Exercise 7.6. Use Fubini's theorem (not the tail integration formula from Theorem 6.2) to show that if X ≥ 0 then

E e^X = 1 + ∫₀^∞ e^t P(X > t) dt.

Then re-derive the same result from Theorem 6.2.

Additional Exercises

Exercise 7.7. Use Fubini's Theorem to show that ∫_R (F(x + a) − F(x)) dx = a.

Exercise 7.8. Prove that for non-negative X and s > 0 we have E(exp(sX)) = 1 + s ∫₀^∞ e^{st} P(X > t) dt.

Exercise 7.9. If F is continuous, prove that ∫ F(x) F(dx) = 1/2.
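As a numerical aside (not part of the text, and a partial spoiler for Exercise 7.3), the convolution formula (7.5) can be checked by Monte Carlo for two independent U(0, 1) random variables: for 0 ≤ x ≤ 1 the integral ∫₀^1 F₁(x − u) du evaluates to x²/2. The sample size N, grid size n, and test point x below are arbitrary choices.

```python
import random

random.seed(0)

# Monte Carlo estimate of P(X1 + X2 <= x) for independent U(0,1) variables
N = 200_000
x = 0.8
hits = sum(1 for _ in range(N) if random.random() + random.random() <= x)
mc_cdf = hits / N

# Right-hand side of (7.5): int_0^1 F1(x - u) du with F1 the U(0,1) CDF,
# evaluated by the midpoint rule
def F1(v):
    return min(1.0, max(0.0, v))

n = 10_000
conv_cdf = sum(F1(x - (i + 0.5) / n) / n for i in range(n))

print(mc_cdf, conv_cdf, x * x / 2)  # all approximately 0.32
```
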
Bibliography

[Billingsley] P. Billingsley, Probability and Measure, 3rd edition.
[Durrett] R. Durrett, Probability: Theory and Examples, Edition 4.1 (online).
[Gut] A. Gut, Probability: A Graduate Course.
[Resnik] S. Resnick, A Probability Path, Birkhäuser, 1998.
[Proschan-Shaw] M. Proschan and P. Shaw, Essentials of Probability Theory for Statisticians, CRC Press, 2016.
[Varadhan] S.R.S. Varadhan, Probability Theory (online pdf from 2000).
Index

L^1 metric, 11
L^2 metric, 11
L^p-norm, 67
λ-system, 25
π-system, 25
σ-field, 16
σ-field generated by X, 45
Bernoulli random variables, 5
Binomial distribution, 17, 81
bivariate cumulative distribution function, 3
Bonferroni's correction, 18
Boole's inequality, 18
Borel σ-field, 45
Borel sigma-field, 16
Cantelli's inequality, 74
cardinality, 9
Cauchy distribution, 115
Cauchy-Schwarz inequality, 66
centered, 72
Central Limit Theorem, 119
characteristic function, 111
characteristic function continuity theorem, 115
Characteristic functions, uniqueness, 114
Characteristic functions, inversion formula, 114
Chebyshev's inequality, 65
complex numbers, 11
conjugate exponents, 68
continuity condition, 14
converge in L^p, 69
converge in mean square, 69
convergence in distribution, 53, 11
converges in distribution, 125
converges in probability, 5
converges pointwise, 7
converges uniformly, 7
converges with probability one, 51
convex function, 66
copula, 56
correlation coefficient, 67
countable additivity, 14
covariance matrix, 128
cumulative distribution function, 26, 47
cylindrical sets, 32, 33
DeMorgan's law, 8
density function, 29, 82
discrete random variable, 81
discrete random variables, 49
distribution of a random variable, 46
dyadic interval, 133
equal in distribution, 47
events, 13, 17
expected value, 62, 77
Exponential distribution, 82
exponential distribution, 29
Fatou's lemma, 79
field, 13
finite dimensional distributions, 32
finitely-additive probability measure, 14
Fubini's Theorem, 88
Geometric distribution, 81
Hölder's inequality, 68, 83
inclusion-exclusion, 18
independent σ-fields, 37
independent events, 37
independent identically distributed, 5
independent random variables, 48
indicator functions, 9
induced measure, 46
infinite number of tosses of a coin, 133
integrable, 77
intersection, 8
Jensen's inequality, 66
joint cumulative distribution function, 3
joint distribution of random variables, 47
Kolmogorov's maximal inequality, 93
Kolmogorov's one series theorem, 94
Kolmogorov's three series theorem, 95
Kolmogorov's two series theorem, 95
Kolmogorov's zero-one law, 93
Kolmogorov-Smirnov metric, 1, 11
Kronecker's Lemma, 95
Lévy distance, 17
law of X, 46
Lebesgue's dominated convergence theorem, 79, 8
Lebesgue's dominated convergence theorem, used, 81, 92, 13, 116
Lévy's metric, 11
Lévy's theorem, 96
Lindeberg condition, 121
Lyapunov's condition, 122
Lyapunov's inequality, 66
marginal cumulative distribution functions, 3
Markov's inequality, 65
maximal inequality, Etemadi's, 96
maximal inequality, Kolmogorov's, 93
mean square convergence, 84
measurable function, 45
measurable rectangle, 87
metric, 1
metric space, 1
Minkowski's inequality, 67, 83
moment generating function, 64, 85
moments, 63
Monotone Convergence Theorem, 77
multivariate normal, 127
multivariate normal distribution, 128
multivariate random variable, 45
negative binomial distribution, 18
normal distribution, 29
Poisson distribution, 18, 81
Polya's distribution, 18
Portmanteau Theorem, 13
power set, 7
probability, 13
probability measure, 14
probability space, 13, 17
product measure, 88
quantile function, 48, 13
random element, 45
random variable, 45
random vector, 45
sample space, 13
Scheffé's theorem, 11
section, 87
semi-algebra, 15
semi-ring, 15
sigma-field generated by A, 16
simple random variable, 61
simple random variables, 49
Skorohod's theorem, 13
Slutsky's Theorem, 12
Standard normal density, 82
stochastic process with continuous trajectories, 47
stochastic processes, 46
stochastically bounded, 57
symmetric distribution, 98
tail σ-field, 38
Tail integration formula, 89
Taylor polynomials, 19
tight, 57
tight probability measure, 19
Tonelli's theorem, 88
total variation metric, 11
truncation of r.v., 56
uncorrelated, 72
uniform continuous, 28
Uniform density, 82
uniform discrete, 28
uniform singular, 28
uniformly integrable, 8, 15
union, 8
variance, 64
Wasserstein distance, 11
weak convergence, 53
weak law of large numbers, 72
zero-one law, 38, 93