Lecture 5: Expectation

1. Expectations for random variables
1.1 Expectations for simple random variables
1.2 Expectations for bounded random variables
1.3 Expectations for general random variables
1.4 Expectation as a Lebesgue integral
1.5 Riemann and Riemann-Stieltjes integrals

2. Expectation and distribution of random variables
2.1 Expectation for transformed discrete random variables
2.2 Expectation for transformed continuous random variables
2.3 Expectation for a product of independent random variables
2.4 Moments of higher order

1. Expectations for random variables

1.1 Expectations for simple random variables

$\langle \Omega, \mathcal{F}, P \rangle$ is a probability space; $X = X(\omega)$ is a real-valued random variable.

$\mathcal{P} = \{A_1, \ldots, A_n\}$ is a partition of the sample space $\Omega$, i.e., a family of sets such that (a) $A_1, \ldots, A_n \in \mathcal{F}$; (b) $A_1 \cup \cdots \cup A_n = \Omega$; (c) $A_i \cap A_j = \emptyset$, $i \neq j$.

Definition 5.1. A random variable $X$ is called a simple random variable if there exists a partition $\mathcal{P} = \{A_1, \ldots, A_n\}$ of
the sample space $\Omega$ and real numbers $x_1, \ldots, x_n$ such that
$$X(\omega) = \sum_{i=1}^n x_i I_{A_i}(\omega) = \begin{cases} x_1 & \text{if } \omega \in A_1, \\ \ \vdots \\ x_n & \text{if } \omega \in A_n. \end{cases}$$

Definition 5.2. If $X$ is a simple random variable, then its expectation (expected value) is defined as
$$EX = \sum_{i=1}^n x_i P(A_i).$$

Notations that are often used: $EX = E[X] = E(X)$.

Examples

(1) Let $X = X(\omega) = M$, $\omega \in \Omega$, where $M$ is a constant. Since $\{\Omega\}$ is a partition, $EX = M P(\Omega) = M$.

(2) Let $I_A = I_A(\omega)$ be the indicator of a random event $A$, i.e., a random variable that takes the values 1 and 0 on the sets $A$ and $\bar{A}$, respectively. Since $\{A, \bar{A}\}$ is a partition, $EI_A = 1 \cdot P(A) + 0 \cdot P(\bar{A}) = P(A)$.

The expectation $EX$ of a simple random variable always exists (takes a finite value) and possesses the following properties:

(1) If $X = \sum_{i=1}^n x_i I_{A_i}$ and $Y = \sum_{j=1}^m y_j I_{B_j}$ are two simple random variables and $a$ and $b$ are any real numbers, then $Z = aX + bY$ is also a simple random variable and $EZ = aEX + bEY$.
(a) If $\{A_1, \ldots, A_n\}$ and $\{B_1, \ldots, B_m\}$ are two partitions of $\Omega$, then $\{A_i \cap B_j,\ i = 1, \ldots, n,\ j = 1, \ldots, m\}$ is also a partition of $\Omega$;

(b) $Z = aX + bY = \sum_{i=1}^n \sum_{j=1}^m (a x_i + b y_j) I_{A_i \cap B_j}$;

(c) $$EZ = \sum_{i=1}^n \sum_{j=1}^m (a x_i + b y_j) P(A_i \cap B_j) = \sum_{i=1}^n \sum_{j=1}^m a x_i P(A_i \cap B_j) + \sum_{i=1}^n \sum_{j=1}^m b y_j P(A_i \cap B_j)$$
$$= a \sum_{i=1}^n x_i \sum_{j=1}^m P(A_i \cap B_j) + b \sum_{j=1}^m y_j \sum_{i=1}^n P(A_i \cap B_j) = a \sum_{i=1}^n x_i P(A_i) + b \sum_{j=1}^m y_j P(B_j) = aEX + bEY.$$

(2) If $X = \sum_{i=1}^n x_i I_{A_i}$ is a simple random variable such that $P(X \geq 0) = 1$, then $EX \geq 0$.

Indeed, $P(X \geq 0) = 1$ implies that $P(A_i) = 0$ if $x_i < 0$. In this case $EX = \sum_{i:\, x_i \geq 0} x_i P(A_i) \geq 0$.

(2') If $X$ and $Y$ are two simple random variables such that $P(X \leq Y) = 1$, then $EX \leq EY$.

Indeed, $P(X \leq Y) = 1 \Rightarrow P(Y - X \geq 0) = 1 \Rightarrow E(Y - X) = EY - EX \geq 0$.
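Definition 5.2 and the linearity property above are easy to check numerically on a finite sample space, where every random variable is simple. The sketch below is my own illustration (the die example and helper names are not from the notes): it verifies $EI_A = P(A)$ and $E(aX + bY) = aEX + bEY$.

```python
# Sample space for one roll of a fair die; every random variable here is simple,
# using the partition of Omega into one-point events {w}.
omega = range(1, 7)
P = {w: 1 / 6 for w in omega}            # uniform probability measure

def E(Z):
    """Definition 5.2: EX = sum_i x_i P(A_i) over the one-point partition."""
    return sum(Z(w) * P[w] for w in omega)

X = lambda w: w                           # face value
I_even = lambda w: 1 if w % 2 == 0 else 0 # indicator of the event A = "even face"

# Example (2): E I_A = P(A); here P(A) = 3/6 = 0.5.
assert abs(E(I_even) - 0.5) < 1e-12

# Property (1), linearity: E(aX + bY) = aEX + bEY.
a, b = 2.0, -3.0
Z = lambda w: a * X(w) + b * I_even(w)
assert abs(E(Z) - (a * E(X) + b * E(I_even))) < 1e-12
print(E(X))   # 3.5
```

The same computation works for any finite partition; only the probabilities `P` and the values of `Z` on each cell change.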
1.2 Expectations for bounded random variables

$\langle \Omega, \mathcal{F}, P \rangle$ is a probability space; $X = X(\omega)$ is a real-valued random variable.

Definition. A random variable $X$ is bounded if there exists a constant $M$ such that $|X(\omega)| \leq M$ for every $\omega \in \Omega$.

Examples

(1) If $\Omega = \{\omega_1, \ldots, \omega_N\}$ is a finite sample space, then any random variable $X = X(\omega)$ is bounded.

(2) If $\Omega = \{\omega = (\omega_1, \omega_2, \ldots),\ \omega_i = 0, 1,\ i = 1, 2, \ldots\}$ is the sample space for an infinite sequence of Bernoulli trials, then the random variable $X = X(\omega) = \min(n \geq 1 : \omega_n = 1)$ (the number of the first successful trial) is an unbounded random variable, while the random variable $Y = Y(\omega) = \omega_1 + \cdots + \omega_n$ (the number of successes in the first $n$ trials) is a bounded random variable.

Definition. If $X$ is a bounded random variable, then its expectation is defined as
$$EX = \sup_{X' \leq X} EX' = \inf_{X'' \geq X} EX'',$$
where the supremum is taken over simple random variables $X' \leq X$ while the infimum is taken over simple random variables $X'' \geq X$.

To be sure that the definition is meaningful we should prove that the sup and the inf in the above definition are equal.
(a) The inequality $\sup_{X' \leq X} EX' \leq \inf_{X'' \geq X} EX''$ holds because any two simple random variables $X' \leq X$ and $X'' \geq X$ are connected by the relation $X' \leq X''$ and therefore $EX' \leq EX''$.

(b) Let $|X(\omega)| < M$. Fix a number $n$ and define
$$A_i = \Big\{\omega \in \Omega : \frac{(i-1)M}{n} < X(\omega) \leq \frac{iM}{n}\Big\}, \quad -n \leq i \leq n.$$
Note that $A_i \in \mathcal{F}$, $i = -n, \ldots, n$. Define the simple random variables
$$X'_n = \sum_{i=-n}^{n} \frac{(i-1)M}{n} I_{A_i}, \qquad X''_n = \sum_{i=-n}^{n} \frac{iM}{n} I_{A_i}.$$
By definition, $X'_n \leq X \leq X''_n$. Moreover, $X''_n - X'_n = \frac{M}{n}$ and, therefore, $EX''_n - EX'_n = \frac{M}{n}$. Thus,
$$\inf_{X'' \geq X} EX'' \leq EX''_n = EX'_n + \frac{M}{n} \leq \sup_{X' \leq X} EX' + \frac{M}{n}.$$
Since $n$ is arbitrary, the relation above implies that $\inf_{X'' \geq X} EX'' \leq \sup_{X' \leq X} EX'$.

(c) By (a) and (b) we get $\sup_{X' \leq X} EX' = \inf_{X'' \geq X} EX''$.

The expectation $EX$ of a bounded random variable always exists (takes a finite value) and possesses properties similar to those for the expectation of a simple random variable:
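The squeezing construction in (b) can be sketched numerically. The example below is my own (it assumes $\Omega = [0,1)$ with the uniform measure, approximated by a fine grid, and $X(\omega) = \sin 2\pi\omega$ with bound $M = 1$): the staircase variables $X'_n \leq X \leq X''_n$ differ pointwise by exactly $M/n$, so their expectations bracket $EX$ within $M/n$.

```python
import math

# Omega = [0, 1) with the uniform measure, discretized on a fine grid;
# X(w) = sin(2 pi w) is bounded: |X| < M = 1 + epsilon, take M = 1.
M, n = 1.0, 50
N = 100_000
grid = [k / N for k in range(N)]                      # stand-in for Omega
X = [math.sin(2 * math.pi * w) for w in grid]

def level(x):
    # Index i with (i-1)M/n < x <= iM/n, i.e. the cell A_i containing x.
    return math.ceil(x * n / M)

lower = [(level(x) - 1) * M / n for x in X]           # X'_n  <= X
upper = [level(x) * M / n for x in X]                 # X''_n >= X

E = lambda v: sum(v) / len(v)                         # expectation under uniform P
assert all(l <= x <= u for l, x, u in zip(lower, X, upper))
assert abs(E(upper) - E(lower) - M / n) < 1e-9        # EX''_n - EX'_n = M/n
assert E(lower) <= E(X) <= E(upper)                   # the bracketing of EX
```

Increasing `n` tightens the bracket at rate $M/n$, which is exactly why the sup and the inf in the definition coincide.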
(1) If $X$ and $Y$ are two bounded random variables and $a$ and $b$ are any real numbers, then $Z = aX + bY$ is also a bounded random variable and $EZ = aEX + bEY$.

(a) Let us first prove that $E(aX) = aEX$. The case $a = 0$ is trivial. The case $a < 0$ is reduced to the case $a > 0$ by considering the random variable $-X$. If $a > 0$, then
$$E(aX) = \sup_{aX' \leq aX} E(aX') = \sup_{X' \leq X} E(aX') = a \sup_{X' \leq X} EX' = aEX.$$

(b) The proof of (1) can be reduced to the case $a = b = 1$ by considering the random variables $aX$ and $bY$. We have
$$\sup_{Z' \leq Z = X + Y} EZ' \geq \sup_{X' \leq X,\ Y' \leq Y} E(X' + Y'),$$
since $X' \leq X$ and $Y' \leq Y$ imply $Z' = X' + Y' \leq Z = X + Y$, and thus the supremum on the right-hand side in the above inequality is actually taken over a smaller set.

(c) Using (b) we get $EZ = E(X + Y) \geq EX + EY$. Indeed,
$$EZ = E(X + Y) = \sup_{Z' \leq Z = X + Y} EZ' \geq \sup_{X' \leq X,\ Y' \leq Y} E(X' + Y') = \sup_{X' \leq X,\ Y' \leq Y} (EX' + EY') = \sup_{X' \leq X} EX' + \sup_{Y' \leq Y} EY' = EX + EY.$$

(d) The reverse inequality follows by considering the random variables $-X$ and $-Y$.
(2) If $X$ is a bounded random variable such that $P(X \geq 0) = 1$, then $EX \geq 0$.

Indeed, denote $A = \{\omega : X(\omega) \geq 0\}$. Let also $M$ be a constant that bounds $|X|$. Then $X(\omega) \geq X_0(\omega)$, $\omega \in \Omega$, where $X_0 = 0 \cdot I_A(\omega) + (-M) I_{\bar{A}}(\omega) = -M I_{\bar{A}}(\omega)$ is a simple random variable. Then, since $P(\bar{A}) = 0$,
$$EX = \sup_{X' \leq X} EX' \geq EX_0 = -M P(\bar{A}) = 0.$$

(2') If $X$ and $Y$ are two bounded random variables such that $P(X \leq Y) = 1$, then $EX \leq EY$.

1.3 Expectations for general random variables

$\langle \Omega, \mathcal{F}, P \rangle$ is a probability space; $X = X(\omega)$ is a real-valued random variable.

Definition. If $X = X(\omega) \geq 0$, $\omega \in \Omega$, i.e., $X$ is a non-negative random variable, then
$$EX = \sup_{X' \leq X} EX',$$
where the supremum is taken over all bounded random variables $X'$ such that $0 \leq X' \leq X$.

The expectation $EX$ of a non-negative random variable can take a finite non-negative value or be equal to infinity.
Any random variable $X$ can be decomposed into the difference of two non-negative random variables $X^+ = \max(X, 0)$ and $X^- = \max(-X, 0)$, that is, $X = X^+ - X^-$.

Definition. If $X$ is integrable, i.e., $E|X| < \infty$, then its expectation is defined as
$$EX = EX^+ - EX^-.$$

The definition is correct since $0 \leq X^+, X^- \leq |X|$ and, since $X$ is integrable, $0 \leq EX^+, EX^- < \infty$.

The expectation of a random variable possesses properties similar to those for expectations of simple and bounded random variables:

(1) If $X$ and $Y$ are two integrable random variables and $a$ and $b$ are any real numbers, then $Z = aX + bY$ is also an integrable random variable and $EZ = aEX + bEY$.

(a) Let us first prove that $E(aX) = aEX$ for the case where $a \geq 0$ and $X \geq 0$; here one should count the product $aEX = 0$ if $a = 0$, $EX = \infty$, and $aEX = \infty$ if $a > 0$, $EX = \infty$. The case $a = 0$ is trivial since in this case $aX \equiv 0$ and therefore $E(aX) = 0$. If $a > 0$ then
$$E(aX) = \sup_{aX' \leq aX} E(aX') = \sup_{X' \leq X} E(aX') = a \sup_{X' \leq X} EX' = aEX.$$

(b) Let us now prove that $E(aX) = aEX$ for an integrable random variable $X$. In this case, the case $a \leq 0$ can be reduced to the
case $a \geq 0$ by considering the random variable $-X$. If $a > 0$ then
$$E(aX) = E(aX)^+ - E(aX)^- = aEX^+ - aEX^- = aEX.$$

(c) The proof of (1) for $X, Y \geq 0$ can be reduced to the case $a = b = 1$ by considering the random variables $aX$ and $bY$. We have
$$\sup_{Z' \leq Z = X + Y} EZ' \geq \sup_{X' \leq X,\ Y' \leq Y} E(X' + Y'),$$
since $X' \leq X$ and $Y' \leq Y$ imply $Z' = X' + Y' \leq Z = X + Y$, and thus the supremum on the right-hand side in the above inequality is actually taken over a smaller set.

(d) Using (c) we get $EZ = E(X + Y) \geq EX + EY$ for $X, Y \geq 0$. Indeed,
$$EZ = E(X + Y) = \sup_{Z' \leq Z = X + Y} EZ' \geq \sup_{X' \leq X,\ Y' \leq Y} E(X' + Y') = \sup_{X' \leq X,\ Y' \leq Y} (EX' + EY') = \sup_{X' \leq X} EX' + \sup_{Y' \leq Y} EY' = EX + EY.$$

(e) To prove $EZ = E(X + Y) \leq EX + EY$ for $X, Y \geq 0$, let us use the inequality for non-negative bounded random variables
$$\min(X + Y, n) \leq \min(X, n) + \min(Y, n).$$
This implies
$$E\min(X + Y, n) \leq E\min(X, n) + E\min(Y, n)$$
and, in sequel,
$$EZ = E(X + Y) = \sup_{Z' \leq Z = X + Y} EZ' = \max_{n \geq 1} \sup_{Z' \leq \min(X + Y, n)} EZ' = \max_{n \geq 1} E\min(X + Y, n)$$
$$\leq \max_{n \geq 1} \big(E\min(X, n) + E\min(Y, n)\big) \leq \max_{n \geq 1} E\min(X, n) + \max_{n \geq 1} E\min(Y, n) = EX + EY.$$

(f) Finally, to prove $EZ = E(X + Y) = EX + EY$ for arbitrary integrable random variables $X$ and $Y$, let us define the random variable $Z$ with positive part $Z^+ = X^+ + Y^+$ and negative part $Z^- = X^- + Y^-$. We have
$$E(X + Y) = E(X^+ - X^- + Y^+ - Y^-) = E(Z^+ - Z^-) = EZ^+ - EZ^- = (EX^+ + EY^+) - (EX^- + EY^-) = EX + EY.$$

(2) If $X$ is a random variable such that $P(X \geq 0) = 1$, then $EX \geq 0$.

Indeed, since $X_0 \equiv 0$ is a non-negative bounded random variable,
$$EX = \sup_{X' \leq X} EX' \geq EX_0 = 0.$$

(2') If $X$ and $Y$ are two random variables such that $P(X \leq Y) = 1$, then $EX \leq EY$.

1.4 Expectation as a Lebesgue integral

$\langle \Omega, \mathcal{F}, P \rangle$ is a probability space; $X = X(\omega)$ is a real-valued random variable defined on the probability space $\langle \Omega, \mathcal{F}, P \rangle$.
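The truncation device in step (e), $E\min(X, n) \uparrow EX$, can be observed numerically. A sketch under my own assumptions (a geometric random variable with $P(X = k) = q p^{k-1}$, $k = 1, 2, \ldots$, whose expectation $1/q$ is standard; the sum over the countable support is truncated at a large cutoff):

```python
# X geometric: P(X = k) = q * p**(k-1), k = 1, 2, ..., with EX = 1/q.
p, q = 0.7, 0.3

def E_truncated(n, K=10_000):
    """E min(X, n), computed from the pmf; K cuts off the infinite sum."""
    pmf = lambda k: q * p ** (k - 1)
    return sum(min(k, n) * pmf(k) for k in range(1, K + 1))

vals = [E_truncated(n) for n in (1, 2, 5, 10, 50)]
# E min(X, n) is non-decreasing in n and converges upward to EX = 1/q.
assert all(a <= b + 1e-12 for a, b in zip(vals, vals[1:]))
assert abs(vals[-1] - 1 / q) < 1e-6
```

For an integrable $X$ the truncated expectations increase to $EX$; for a non-integrable non-negative $X$ they would grow without bound, consistent with $EX = \infty$.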
In fact, $EX$, as it was defined above, is the Lebesgue integral of the real-valued function $X = X(\omega)$ with respect to the measure $P(A)$ and, therefore, according to the notations used in integration theory,
$$EX = \int_\Omega X(\omega) P(d\omega).$$
The following notations are also used:
$$EX = \int_\Omega X(\omega) P(d\omega) = \int_\Omega X dP = \int X dP.$$

Definition 5.3. A finite measure $Q(A)$ defined on a σ-algebra $\mathcal{F}$ is a function that can be represented as $Q(A) = qP(A)$, where $P(A)$ is a probability measure defined on $\mathcal{F}$ and $q > 0$ is a positive constant.

Definition 5.4. The Lebesgue integral $\int_\Omega X dQ$ is defined by the following formula:
$$\int_\Omega X dQ = q \int_\Omega X dP.$$

Examples

(1) The Lebesgue measure $m(A)$ on the Borel σ-algebra of an interval $[c, d]$ is uniquely determined by its values on intervals, $m((a, b]) = b - a$, $c \leq a \leq b \leq d$. It can be represented in the form $m(A) = qP(A)$, where $q = d - c$ and $P(A)$ is a probability measure on the Borel σ-algebra of the interval
$[c, d]$, which is uniquely determined by its values on intervals, $P((a, b]) = \frac{b - a}{d - c}$, $c \leq a \leq b \leq d$.

(2) According to the above definition,
$$\int_{[c,d]} X dm = \int_c^d X dm = q \int_{[c,d]} X dP = qEX,$$
where $X$ should be considered as a random variable defined on the probability space $\langle \Omega = [c, d], \mathcal{F} = \mathcal{B}([c, d]), P(A) \rangle$.

Definition. A σ-finite measure $Q(A)$ defined on a σ-algebra $\mathcal{F}$ is a function of sets for which there exists a sequence of sets $\Omega_n \in \mathcal{F}$, $\Omega_n \subseteq \Omega_{n+1}$, $n = 1, 2, \ldots$, $\bigcup_n \Omega_n = \Omega$, such that $Q(\Omega_n) < \infty$, $n = 1, 2, \ldots$, and $Q(A) = \lim_n Q(A \cap \Omega_n)$.

Definition. The Lebesgue integral $\int_\Omega X dQ$ is defined for a random variable $X = X(\omega)$ and a σ-finite measure $Q$, under the condition that $\int_{\Omega_n} |X| dQ < \infty$, $n = 1, 2, \ldots$, and $\lim_n \int_{\Omega_n} |X| dQ < \infty$, by the following formula:
$$\int_\Omega X dQ = \lim_n \int_{\Omega_n} X dQ.$$

Examples

(1) The Lebesgue measure $m(A)$ on the Borel σ-algebra of the real line $R^1$ is uniquely determined by its values on intervals, $m((a, b]) = b - a$, $a \leq b$. It can be represented in the form $m(A) = \lim_n m(A \cap [-n, n])$, where $m(A \cap [-n, n])$ is the Lebesgue measure on the interval $[-n, n]$ for every $n$.

(2) According to the above definition, $\int_{R^1} X dm = \int_{-\infty}^{\infty} X dm = \lim_n \int_{-n}^{n} X dm$, under the condition that $\int_{-n}^{n} |X| dm < \infty$, $n = 1, 2, \ldots$, and $\lim_n \int_{-n}^{n} |X| dm < \infty$.
1.5 Riemann and Riemann-Stieltjes integrals

$f(x)$ is a real-valued function defined on the real line;

$[a, b]$; $a = x_{n,0} < x_{n,1} < \cdots < x_{n,n} = b$; $d(n) = \max_{1 \leq k \leq n}(x_{n,k} - x_{n,k-1}) \to 0$ as $n \to \infty$; $x'_{n,k} \in [x_{n,k-1}, x_{n,k}]$, $k = 1, \ldots, n$, $n = 1, 2, \ldots$;

$$S_n = \sum_{k=1}^n f(x'_{n,k})(x_{n,k} - x_{n,k-1}).$$

Definition 5.5. The Riemann integral $\int_a^b f(x)dx$ exists if and only if there exists the same limit $\lim_n S_n$ for any choice of partitions such that $d(n) \to 0$ as $n \to \infty$ and points $x'_{n,k}$. In this case $\int_a^b f(x)dx = \lim_n S_n$.

Definition 5.6. If a function $f$ is bounded and Riemann integrable on every finite interval, and $\lim_n \int_{-n}^{n} |f(x)|dx < \infty$, then the function $f$ is Riemann integrable on the real line and
$$\int_{-\infty}^{\infty} f(x)dx = \lim_n \int_{-n}^{n} f(x)dx.$$

Theorem 5.1*. A real-valued bounded Borel function $f(x)$ defined on the real line is Riemann integrable on $[a, b]$ if and only if its set of discontinuity points $R_f \cap [a, b]$ has Lebesgue measure $m(R_f \cap [a, b]) = 0$.

Theorem 5.2*. Let $\Omega = R^1$ and $\mathcal{F} = \mathcal{B}_1$, and let $f = f(x)$ be a Riemann integrable function, i.e., $\int_{-\infty}^{\infty} |f(x)|dx < \infty$. Then the
Lebesgue integral $\int |f(x)| m(dx) < \infty$ and
$$\int_{-\infty}^{\infty} f(x)dx = \int f(x) m(dx).$$

Example

Let $D$ be the set of all irrational points in the interval $[a, b]$. The function $I_D(x)$, $a \leq x \leq b$, is a bounded Borel function which is discontinuous at all points of the interval $[a, b]$. It is not Riemann integrable. But it is Lebesgue integrable, since it is a simple function, and
$$\int_{[a,b]} I_D(x) m(dx) = 0 \cdot m([a, b] \setminus D) + 1 \cdot m(D) = 0 \cdot 0 + 1 \cdot (b - a) = b - a.$$

$f(x)$ is a real-valued function defined on the real line; $G(x)$ is a real-valued non-decreasing and continuous-from-the-right function defined on the real line; $G(A)$ is the measure uniquely defined by the function $G(x)$ via the relations $G((a, b]) = G(b) - G(a)$, $-\infty < a \leq b < \infty$.

$[a, b]$; $a = x_{n,0} < x_{n,1} < \cdots < x_{n,n} = b$; $d(n) = \max_{1 \leq k \leq n}(x_{n,k} - x_{n,k-1}) \to 0$ as $n \to \infty$; $x'_{n,k} \in [x_{n,k-1}, x_{n,k}]$, $k = 1, \ldots, n$, $n = 1, 2, \ldots$;

$$S_n = \sum_{k=1}^n f(x'_{n,k})\big(G(x_{n,k}) - G(x_{n,k-1})\big).$$

Definition 5.7. The Riemann-Stieltjes integral $\int_a^b f(x)dG(x)$ exists if and only if there exists the same limit $\lim_n S_n$ for any choice of partitions such that $d(n) \to 0$ as $n \to \infty$ and points $x'_{n,k}$. In this case $\int_a^b f(x)dG(x) = \lim_n S_n$.
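The partition sums $S_n$ of Definition 5.7 can be computed directly. A sketch with my own choice of data (a uniform partition with midpoint evaluation points, $f(x) = x$ and $G(x) = x^2$ on $[0, 1]$, so that $dG(x) = 2x\,dx$ and the exact value is $\int_0^1 x \cdot 2x\,dx = 2/3$):

```python
def stieltjes_sum(f, G, a, b, n):
    """S_n = sum_k f(x'_{n,k}) (G(x_{n,k}) - G(x_{n,k-1})) on a uniform
    partition of [a, b], with x'_{n,k} the midpoint of each cell."""
    h = (b - a) / n
    s = 0.0
    for k in range(1, n + 1):
        xk_1, xk = a + (k - 1) * h, a + k * h
        s += f((xk_1 + xk) / 2) * (G(xk) - G(xk_1))
    return s

f = lambda x: x
G = lambda x: x * x            # dG(x) = 2x dx, so the exact integral is 2/3
approx = stieltjes_sum(f, G, 0.0, 1.0, 10_000)
assert abs(approx - 2 / 3) < 1e-6
```

Because $f$ is continuous here, any other choice of evaluation points $x'_{n,k}$ would give the same limit, which is exactly the content of the definition.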
Definition 5.8. If a function $f$ is bounded and Riemann-Stieltjes integrable on every finite interval, and $\lim_n \int_{-n}^{n} |f(x)|dG(x) < \infty$, then the function $f$ is Riemann-Stieltjes integrable on the real line and
$$\int_{-\infty}^{\infty} f(x)dG(x) = \lim_n \int_{-n}^{n} f(x)dG(x).$$

Theorem 5.3*. A real-valued bounded Borel function $f(x)$ defined on the real line is Riemann-Stieltjes integrable on $[a, b]$ if and only if its set of discontinuity points $R_f \cap [a, b]$ has the measure $G(R_f \cap [a, b]) = 0$.

Theorem 5.4*. Let $\Omega = R^1$ and $\mathcal{F} = \mathcal{B}_1$, and let $f = f(x)$ be a Riemann-Stieltjes integrable function, i.e., $\int_{-\infty}^{\infty} |f(x)|dG(x) < \infty$. Then the Lebesgue integral $\int |f(x)| G(dx) < \infty$ and
$$\int_{-\infty}^{\infty} f(x)dG(x) = \int f(x) G(dx).$$

2. Expectation and distribution of random variables

2.1 Expectation for transformed discrete random variables

$\langle \Omega, \mathcal{F}, P \rangle$ is a probability space; $X = X(\omega)$ is a real-valued random variable defined on the probability space $\langle \Omega, \mathcal{F}, P \rangle$; $g(x)$ is a Borel real-valued function defined on the real line; $Y = g(X)$ is a transformed random variable.
Definition 5.9. A random variable $X$ is a discrete random variable if there exists a finite or countable set of real numbers $\{x_n\}$ such that $\sum_n p_X(x_n) = 1$, where $p_X(x_n) = P(X = x_n)$.

Theorem 5.5**. Let $X$ be a discrete random variable. Then
$$EY = Eg(X) = \int_\Omega g(X(\omega)) P(d\omega) = \sum_n g(x_n) p_X(x_n).$$

Examples

(1) Let $\Omega = \{\omega_1, \ldots, \omega_N\}$ be a discrete sample space, $\mathcal{F} = \mathcal{F}_0$ the σ-algebra of all subsets of $\Omega$, and $P(A)$ a probability measure given by the formula $P(A) = \sum_{\omega_i \in A} p(\omega_i)$, where $p(\omega_i) = P(A_i) \geq 0$, $i = 1, \ldots, N$, are the probabilities of the one-point events $A_i = \{\omega_i\}$ satisfying the relation $\sum_{\omega_i \in \Omega} p(\omega_i) = 1$. A random variable $X = X(\omega)$ and the transformed random variable $Y = g(X)$ are, in this case, simple random variables, since $\{A_1, \ldots, A_N\}$ is a partition of $\Omega$ and $X = \sum_{\omega_i \in \Omega} X(\omega_i) I_{A_i}$ and $Y = \sum_{\omega_i \in \Omega} g(X(\omega_i)) I_{A_i}$. In this case,
$$p_X(x_j) = P(X = x_j) = \sum_{\omega_i :\, X(\omega_i) = x_j} p(\omega_i)$$
and, according to the definition of expectation and Theorem 5.5,
$$EY = Eg(X) = \sum_{\omega_i \in \Omega} g(X(\omega_i)) p(\omega_i) = \sum_n g(x_n) p_X(x_n).$$

(2) Let $\Omega = \{\omega = (\omega_1, \ldots, \omega_n),\ \omega_i = 0, 1,\ i = 1, \ldots, n\}$ be the discrete sample space for a series of $n$ Bernoulli trials. In this case $p(\omega) = \prod_{i=1}^n p^{\omega_i} q^{1 - \omega_i}$, where $p, q > 0$, $p + q = 1$.
Let $X(\omega) = \omega_1 + \cdots + \omega_n$ be the number of successes in $n$ trials. In this case, $x_j = j$, $j = 0, \ldots, n$, and
$$p_X(j) = P(X = j) = \sum_{\omega :\, X(\omega) = j} p^j q^{n-j} = C_n^j p^j q^{n-j}, \quad j = 0, \ldots, n,$$
where $C_n^j = \frac{n!}{j!(n-j)!}$, and, according to the definition of expectation and Theorem 5.5,
$$EX = \sum_{\omega \in \Omega} X(\omega) p(\omega) = \sum_{j=0}^n j p_X(j) = np.$$

(3) Let $X$ be a Poisson random variable, i.e., $p_X(n) = e^{-\lambda} \frac{\lambda^n}{n!}$, $n = 0, 1, \ldots$. Then
$$EX = \sum_{n=0}^{\infty} n e^{-\lambda} \frac{\lambda^n}{n!} = \lambda.$$

2.2 Expectation for transformed continuous random variables

$\langle \Omega, \mathcal{F}, P \rangle$ is a probability space; $X = X(\omega)$ is a real-valued random variable defined on the probability space $\langle \Omega, \mathcal{F}, P \rangle$; $P_X(A) = P(X \in A)$ and $F_X(x) = P(X \leq x)$ are, respectively, the distribution and the distribution function of the random variable $X$; $g(x)$ is a Borel real-valued function defined on the real line; $Y = g(X)$ is a transformed random variable.
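The closed-form answers in Examples (2) and (3) can be confirmed by evaluating the sums $\sum_j j\,p_X(j)$ directly, as a sketch (the parameter values are my own; the Poisson series is truncated at a cutoff where the remaining terms are negligible):

```python
from math import comb, exp, factorial

# Binomial (Example 2): EX = sum_j j * C(n, j) p^j q^(n-j) should equal n p.
n, p = 12, 0.3
q = 1 - p
EX_binom = sum(j * comb(n, j) * p**j * q**(n - j) for j in range(n + 1))
assert abs(EX_binom - n * p) < 1e-12

# Poisson (Example 3): EX = sum_k k e^{-lam} lam^k / k! should equal lam;
# the infinite series is truncated at K terms (the tail is negligible).
lam, K = 2.5, 100
EX_pois = sum(k * exp(-lam) * lam**k / factorial(k) for k in range(K))
assert abs(EX_pois - lam) < 1e-9
```

Both computations are instances of Theorem 5.5 with $g(x) = x$.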
Theorem 5.6**. Let $X$ be a random variable. Then
$$EY = Eg(X) = \int_\Omega g(X(\omega)) P(d\omega) = \int_{-\infty}^{\infty} g(x) P_X(dx).$$

Definition 5.10. A random variable $X$ is a continuous random variable if there exists a non-negative Borel function $f_X(x)$ defined on the real line, with $\int_{-\infty}^{\infty} f_X(x) m(dx) = 1$, such that
$$F_X(x) = \int_{-\infty}^{x} f_X(y) m(dy), \quad x \in R^1.$$
The function $f_X(x)$ is called the probability density of the random variable $X$ (or of the distribution function $F_X(x)$).

According to Theorem 5.2, if $f_X(x)$ is a Riemann integrable function, i.e., $\int_{-\infty}^{\infty} f_X(x)dx = 1$, then
$$F_X(x) = \int_{-\infty}^{x} f_X(y) m(dy) = \int_{-\infty}^{x} f_X(y)dy, \quad x \in R^1.$$

Theorem 5.7**. Let $X$ be a continuous random variable with probability density $f_X$. Then
$$EY = Eg(X) = \int_\Omega g(X(\omega)) P(d\omega) = \int_{-\infty}^{\infty} g(x) P_X(dx) = \int_{-\infty}^{\infty} g(x) f_X(x) m(dx).$$

According to Theorem 5.2, if $g(x) f_X(x)$ is a Riemann integrable function, i.e., $\int_{-\infty}^{\infty} |g(x) f_X(x)|dx < \infty$, then
$$EY = Eg(X) = \int_\Omega g(X(\omega)) P(d\omega) = \int_{-\infty}^{\infty} g(x) P_X(dx)$$
$$= \int_{-\infty}^{\infty} g(x) f_X(x) m(dx) = \int_{-\infty}^{\infty} g(x) f_X(x)dx.$$

Examples

(1) Let $\Omega = [0, T] \times [0, T]$, $\mathcal{F} = \mathcal{B}(\Omega)$, and let $m(A)$ be the Lebesgue measure on $\mathcal{B}(\Omega)$, which is uniquely determined by its values on rectangles, $m([a, b] \times [c, d]) = (b - a)(d - c)$ ($m(A)$ is the area of a Borel set $A$). Let also the corresponding probability measure be $P(A) = \frac{m(A)}{T^2}$. Let the random variable be $X(\omega) = \omega_1 \wedge \omega_2$, $\omega = (\omega_1, \omega_2) \in \Omega$. Find $EX$.

(1') Direct computation:
$$EX = \frac{1}{T^2} \int_\Omega (\omega_1 \wedge \omega_2)\, m(d\omega) = \frac{1}{T^2} \int_{[0,T] \times [0,T]} (\omega_1 \wedge \omega_2)\, d\omega_1 d\omega_2 = \,?$$

(1'') The distribution function is
$$F_X(x) = P(X \leq x) = \frac{T^2 - (T - x)^2}{T^2} = 1 - \Big(1 - \frac{x}{T}\Big)^2, \quad 0 \leq x \leq T.$$
It has a continuous (and, therefore, Riemann integrable) probability density $f_X(x) = \frac{2}{T}\big(1 - \frac{x}{T}\big)$, $0 \leq x \leq T$. Thus,
$$EX = \int_0^T x \frac{2}{T}\Big(1 - \frac{x}{T}\Big)dx = \frac{T}{3}.$$

(2) Let $X = X(\omega)$ be a random variable defined on a probability space $\langle \Omega, \mathcal{F}, P \rangle$ with the distribution function $F(x) = P(X \leq x)$ and the distribution $F(A) = P(X \in A)$. Then
$$EX = \int_\Omega X(\omega) P(d\omega) = \int_{R^1} x F(dx) = \int_{-\infty}^{\infty} x\,dF(x).$$

(3) Let $X$ be a non-negative random variable. Then the above formula can be transformed into the following form:
$$EX = \int_{[0,\infty)} x F(dx) = \int_0^{\infty} x\,dF(x) = \int_0^{\infty} (1 - F(x))dx.$$
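The answer $EX = T/3$ for the minimum of two independent uniform coordinates on $[0, T]$ is easy to sanity-check by simulation. A Monte Carlo sketch (sample size, seed, and tolerance are my own choices):

```python
import random

random.seed(0)
T, N = 2.0, 200_000

# X = min(w1, w2) for (w1, w2) uniform on the square [0, T] x [0, T];
# its distribution function is F(x) = 1 - (1 - x/T)^2, and EX = T/3.
sample_mean = sum(min(random.uniform(0, T), random.uniform(0, T))
                  for _ in range(N)) / N
assert abs(sample_mean - T / 3) < 0.01
print(sample_mean)   # close to T/3
```

The standard error of the sample mean here is about $T/\sqrt{18 N} \approx 0.001$, so the 0.01 tolerance is comfortable.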
(a) 0 xdf (x) = lim A0 A xdf (x); (b) 0 (1 F (x))dx = lim A A0 (1 F (x))dx; (c) A 0 xdf (x) = A(1 F (A)) + A 0 (1 F (x))dx; (d) 0 (1 F (x))dx < 0 (e)a(1 F (A)) A xdf (x); xdf (x) < (f) 0 xdf (x) < 0 (1 F (x))dx < ; 2.3 Expectation for product of independent random variables Theorem 5.8. If X and Y are two independent random variables and E X, E Y <. Then E XY < and EXY = EXEY. (g) (a) - (c) 0 xdf (x) = 0 (1 F (x))dx. (a) Let X = n i=1 x i I Ai and Y = m j=1 y j I Bj are two simple independent random variables. Then XY = n mj=1 i=1 x i y j I Ai B j is also a simple random variable and, therefore, EXY = n m i=1 j=1 = n i=1 x i y j P (A i B j ) = n m i=1 j=1 x i P (A i ) m y j P (B j ) = EXEY. j=1 x i y j P (A i )P (B j ) 20
(b) The proof for bounded and general random variables is analogous to the proofs given for the linearity property of expectations.

2.4 Moments of higher order

Let $X = X(\omega)$ be a random variable defined on a probability space $\langle \Omega, \mathcal{F}, P \rangle$ with the distribution function $F_X(x) = P(X \leq x)$ and the distribution $F_X(A) = P(X \in A)$. Let also $Y = X^n$, with the distribution function $F_Y(y) = P(X^n \leq y)$ and the distribution $F_Y(A) = P(X^n \in A)$.

Definition 5.11. The moment of order $n$ of the random variable $X$ is the expectation of the random variable $Y = X^n$:
$$EX^n = \int_\Omega X(\omega)^n P(d\omega) = \int_{R^1} y F_Y(dy) = \int_{-\infty}^{\infty} y\,dF_Y(y) = \int_{R^1} x^n F_X(dx) = \int_{-\infty}^{\infty} x^n dF_X(x).$$

Problems

1. Let $X$ be a discrete random variable taking non-negative integer values $0, 1, 2, \ldots$. Prove that $EX = \sum_{n=1}^{\infty} P(X \geq n)$.

2. Let $X$ be a non-negative random variable and $F(x) = P(X \leq x)$. Prove that $EX^n = n \int_0^{\infty} x^{n-1}(1 - F(x))dx$.
3. Let $X$ be a geometric random variable that takes values $n = 1, 2, \ldots$ with probabilities $P(X = n) = qp^{n-1}$, $n = 1, 2, \ldots$. Find: (a) $P(X \geq n)$; (b) $EX$.

4. The random variable $X$ has a Poisson distribution with parameter $\lambda > 0$. Find $E\frac{1}{1 + X}$.

5. Let $X_1, \ldots, X_n$ be independent random variables uniformly distributed in the interval $[0, T]$ and $Z_n = \max(X_1, \ldots, X_n)$. Find: (a) $P(Z_n \leq x)$; (b) $EZ_n$; (c) $E(Z_n - T)^2$.

6. Let $X_1, \ldots, X_n$ be independent random variables uniformly distributed in the interval $[0, T]$ and $Y_n = 2\frac{X_1 + \cdots + X_n}{n}$. Find: (a) $EY_n$; (b) $E(Y_n - T)^2$.

7. Let $\operatorname{Var} X = E(X - EX)^2 < \infty$. Prove that (a) $\operatorname{Var} X = EX^2 - (EX)^2$; (b) $\operatorname{Var} X = \inf_{a \in R^1} E(X - a)^2$.

8. Let $X$ and $Y$ be independent random variables with $\operatorname{Var} X, \operatorname{Var} Y < \infty$. Prove that $\operatorname{Var}(X + Y) = \operatorname{Var} X + \operatorname{Var} Y$.

9. Let $X \geq 0$ be a continuous non-negative random variable with $EX^2 < \infty$. Prove that $EX^2 = \int_0^{\infty} x^2 f_X(x)dx = 2\int_0^{\infty} x(1 - F_X(x))dx$.

10. Let a random variable $X$ have an exponential distribution, $F_X(x) = I(x \geq 0)(1 - e^{-\lambda x})$. Find $EX$ and $\operatorname{Var} X = E(X - EX)^2$.