Lecture 2: Random Variables and Expectation


Econ 514: Probability and Statistics
Lecture 2: Random Variables and Expectation

Definition of function: Given sets X and Y, a function f with domain X and codomain Y is a rule that assigns to every x ∈ X one (and only one) y ∈ Y. Notation: f : X → Y, y = f(x).

Definition of random variable: Let (Ω, A, P) be a probability space. A random variable X is a function X : Ω → R such that for all B ∈ B, with B the Borel σ-algebra on R,

E = {ω : X(ω) ∈ B} ∈ A

The set E is also denoted by E = X⁻¹(B). This does not mean that X⁻¹ exists, i.e. that X is a 1-1 function! See figure.

A random variable X is thus a function that is in addition Borel measurable. Why we need this will be discussed later. Measurability is a more general concept; because random variables are always functions to R, we only need Borel measurability, since for R we always take the Borel σ-algebra. Often the function X can take the values −∞ and ∞, i.e. X is a function to the extended real line (R together with −∞ and ∞) with the extended Borel σ-field that contains all sets in B and the two points −∞, ∞. Why random variables? Often the outcomes of a random experiment are complicated; a random variable summarizes (aspects of) an outcome in a single number.

Example: Three tosses of a single coin. Outcome space

Ω = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}

Define X = number of H in 3 tosses:

X(ω) = 3 if ω = HHH
     = 2 if ω ∈ {HHT, HTH, THH}
     = 1 if ω ∈ {THT, TTH, HTT}
     = 0 if ω = TTT

For a given random experiment we can define many random variables, e.g. in this example Y = number of T before the first H.
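The three-toss example can be sketched in a few lines of Python (a toy illustration, not part of the notes; the names omega, X, Y are ad hoc):

```python
from itertools import product

# Enumerate Omega and tabulate the random variables X and Y from the example.
omega = ["".join(t) for t in product("HT", repeat=3)]   # 8 outcomes
X = {w: w.count("H") for w in omega}                    # number of H in 3 tosses
Y = {w: w.index("H") if "H" in w else 3 for w in omega} # T's before the first H

# The event {X = 2} is a subset of Omega, as measurability requires.
assert sorted(w for w in omega if X[w] == 2) == ["HHT", "HTH", "THH"]
assert Y["TTH"] == 2 and Y["TTT"] == 3
```

On a finite outcome space every function to R is measurable, which is why the subtleties above only bite for richer Ω.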

Measurability. To establish measurability we use a generating class argument, because it is often easier to establish measurability on such a class. If E is a generating class for B, e.g. the intervals (−∞, x] or (x, ∞), then we only need to show that X⁻¹(E) ∈ A for all E ∈ E.

Proof: Define C = {B ∈ B : X⁻¹(B) ∈ A}. We show that this is a σ-field. (i) ∅ ∈ C. (ii) Note X⁻¹(Bᶜ) = X⁻¹(B)ᶜ, because by definition ω ∈ X⁻¹(B)ᶜ iff X(ω) ∉ B iff X(ω) ∈ Bᶜ. Hence X⁻¹(Bᶜ) ∈ A. (iii) Note X⁻¹(∪_{i=1}^∞ B_i) = ∪_{i=1}^∞ X⁻¹(B_i), because ω ∈ X⁻¹(∪_{i=1}^∞ B_i) iff X(ω) ∈ ∪_{i=1}^∞ B_i. Hence X⁻¹(∪_{i=1}^∞ B_i) ∈ A. Because C is a σ-field and E ⊆ C, we have B = σ(E) ⊆ C. Hence X⁻¹(B) ∈ A for all B ∈ B, so that X is Borel measurable.

Applications. Let Ω = R, i.e. X : R → R. If X is a continuous function, then X is Borel measurable, because if E is an open set (the open sets in R are a generating class), then X⁻¹(E) is also open and hence in B.

Let X_n, n = 1, 2, ... be a sequence of random variables; then X_sup = sup_n X_n and X_inf = inf_n X_n are also random variables, i.e. they are Borel measurable functions. Note that X_sup(ω) may be equal to ∞ for some ω (and X_inf(ω) equal to −∞), i.e. they are measurable w.r.t. the extended Borel σ-field. To see this note that the intervals (x, ∞] are a generating class for the extended Borel σ-field, and that

{ω : X_sup(ω) > x} = ∪_n {ω : X_n(ω) > x}

The last union is clearly in A. For X_inf take the generating sets [−∞, x).

If lim X_n = X exists (it may be ±∞), then this is a random variable. To see this note that lim inf X_n(ω) = sup_n inf_{m≥n} X_m(ω) and lim sup X_n(ω) = inf_n sup_{m≥n} X_m(ω). From the previous result these are Borel measurable functions. We have

lim inf X_n(ω) ≤ X(ω) = lim X_n(ω) ≤ lim sup X_n(ω)

Hence if the limit exists, it is equal to the liminf and limsup, which are Borel measurable.
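The set identity behind the measurability of sup_n X_n can be checked on a finite toy version (a sketch under ad-hoc choices of the X_n; nothing here is from the notes):

```python
import random

# Toy finite check of {sup_n X_n > x} = union_n {X_n > x}.
random.seed(0)
omegas = range(10)                                          # finite stand-in for Omega
X = [[random.random() for _ in omegas] for _ in range(5)]   # X_1, ..., X_5

x = 0.8
lhs = {w for w in omegas if max(Xn[w] for Xn in X) > x}     # {sup_n X_n > x}
rhs = set().union(*({w for w in omegas if Xn[w] > x} for Xn in X))
assert lhs == rhs
```

The identity holds for any choice of the X_n and x; the randomness is only there to make the check non-trivial.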

Let X and Y be random variables; then Z = X + Y is Borel measurable. Note that

A = {ω : Z(ω) > z} = ∪_x {ω : X(ω) = x} ∩ {ω : Y(ω) > z − x}

which involves an uncountable union. Because the countable set of rational numbers is a dense subset of R, for every ω with X(ω) > z − Y(ω) there is a rational number r such that X(ω) > r > z − Y(ω). Hence

A = ∪_{r rational} {ω : X(ω) > r} ∩ {ω : Y(ω) > z − r}

which is a countable union.

We denote the set of all Borel measurable functions X : Ω → R by M and the subset of Borel measurable nonnegative functions by M⁺. A special class of nonnegative Borel measurable functions are the simple functions, which can be written as

X(ω) = Σ_{i=1}^n α_i I_{A_i}(ω)

with I_A the indicator function of the event A, A_i ∈ A, i = 1, ..., n a partition of Ω, and α_i ≥ 0, i = 1, ..., n constants. Each function in M⁺ can be approximated by an increasing sequence of simple functions.

Theorem 1. For each X in M⁺, the sequence of simple functions

X_n(ω) = Σ_{i=1}^{4^n} 2^{−n} I_{X ≥ i·2^{−n}}(ω)

is such that 0 ≤ X_1(ω) ≤ X_2(ω) ≤ ... ≤ X_n(ω) ≤ ... and X_n(ω) → X(ω) for all ω ∈ Ω.

Proof: If X(ω) ≥ 2^n, then X_n(ω) = 2^n. If k·2^{−n} ≤ X(ω) < (k+1)·2^{−n} for some k = 0, 1, ..., 4^n − 1, then X_n(ω) = k·2^{−n}. See the figure for a graph; the claim follows.
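The approximating sequence of Theorem 1 is easy to compute pointwise: X_n truncates X at 2^n and rounds down to a multiple of 2^{−n}. A minimal sketch (the function name approx is ad hoc):

```python
# X_n(omega) = sum_{i=1}^{4^n} 2^{-n} * 1{X(omega) >= i * 2^{-n}}
def approx(x, n):
    """Value of the n-th simple function at a point where X = x (x >= 0)."""
    if x >= 2 ** n:
        return float(2 ** n)        # truncation level
    k = int(x * 2 ** n)             # k * 2^-n <= x < (k+1) * 2^-n
    return k / 2 ** n

x = 3.14159
vals = [approx(x, n) for n in range(1, 10)]
assert all(a <= b for a, b in zip(vals, vals[1:]))  # monotone increase
assert abs(vals[-1] - x) <= 2 ** -9                 # within 2^-n of x
```

For n = 1 the value is the truncation 2.0 (since x ≥ 2); from n = 2 on, the values climb toward x in steps of at most 2^{−n}.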

Expectation and integration. Random experiment: toss a coin twice. Ω = {HH, HT, TH, TT} and these outcomes are equally likely. Random variable: X is the number of H in two tosses; X takes the values 0 (TT), 1 (TH, HT), and 2 (HH). You receive the uncertain return $X. How much do you want to pay for this gamble if you are risk neutral?

Most people make the following computation. Consider a large number of repetitions of the random experiment. The relative frequency of the values of X is 1/4 (X = 0), 1/2 (X = 1), 1/4 (X = 2). On average (over the repetitions) X is

0·(1/4) + 1·(1/2) + 2·(1/4) = 1

You are willing to pay $1 for the gamble. Call this the expected value of X, denoted by E(X).

Direct computation: note that X is a nonnegative simple function for the partition A_1 = {TT}, A_2 = {HT, TH}, A_3 = {HH} and

X(ω) = 0·I_{A_1}(ω) + 1·I_{A_2}(ω) + 2·I_{A_3}(ω)

E(X) = 0·P(A_1) + 1·P(A_2) + 2·P(A_3) = 1

In general, if X(ω) = Σ_{i=1}^n α_i I_{A_i}(ω) is a simple function, the expected value of X is

E(X) = Σ_{i=1}^n α_i P(A_i)

Note that this is a weighted average of the values of X, the weights being the probabilities of these values. This suggests the notation

E(X) = ∫_Ω X(ω) dP(ω) = ∫_Ω X dP
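The direct computation E(X) = Σ α_i P(A_i) for the two-toss example, written out as a toy sketch using exact fractions:

```python
from fractions import Fraction
from itertools import product

# Two-toss sample space, equally likely outcomes, X = number of heads.
omega = ["".join(t) for t in product("HT", repeat=2)]
P = {w: Fraction(1, 4) for w in omega}
X = {w: w.count("H") for w in omega}

# E(X) = sum over values a of a * P(X = a)
EX = sum(a * sum(P[w] for w in omega if X[w] == a) for a in {0, 1, 2})
print(EX)  # -> 1
```

Exact rationals (rather than floats) keep the weighted average 0·(1/4) + 1·(1/2) + 2·(1/4) free of rounding noise.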

How do you compute E(X) for a general random variable X? We use Theorem 1. Let X be a nonnegative random variable defined on the probability space (Ω, A, P). By Theorem 1 there is an increasing sequence of simple functions X_n that has limit X. This is why we need X to be Borel measurable in order to define E(X). Define, with X_S ranging over the simple functions,

E(X) = ∫ X dP = sup_{X_S} {E(X_S) : X_S ≤ X}

Properties of E(X):
(i) E(I_A) = P(A) for A ∈ A.
(ii) E(0) = 0, with 0 the null function that assigns 0 to all ω ∈ Ω.

(iii) For α, β ≥ 0 and nonnegative Borel measurable functions X, Y

E(αX + βY) = αE(X) + βE(Y)

This is the linearity of the expectation.

Proof: Note that if X_S, Y_S are simple functions, then so is Z_S = X_S + Y_S; also E(Z_S) = E(X_S) + E(Y_S). Hence

E(X) + E(Y) = sup_{X_S} {E(X_S) : X_S ≤ X} + sup_{Y_S} {E(Y_S) : Y_S ≤ Y} =
= sup_{X_S, Y_S} {E(X_S) + E(Y_S) : X_S ≤ X, Y_S ≤ Y} =
= sup_{X_S, Y_S} {E(X_S + Y_S) : X_S ≤ X, Y_S ≤ Y} ≤
≤ sup_{Z_S} {E(Z_S) : Z_S ≤ X + Y} = E(X + Y)

where the inequality holds because X_S + Y_S ≤ X + Y for every such pair.

Next we prove E(X + Y) ≤ E(X) + E(Y). Let Z_S be a simple function with Z_S ≤ X + Y and let ε > 0. We construct simple functions X_S ≤ X and Y_S ≤ Y such that (1 − ε)Z_S ≤ X_S + Y_S. We do the construction for Z_S = I_A; the general case is analogous. Take ε = 1/m and denote l_j = j/m. Define

X_S(ω) = I_A(ω) ( I_{X ≥ 1}(ω) + Σ_{j=1}^m l_{j−1} I_{l_{j−1} ≤ X < l_j}(ω) )

Y_S(ω) = I_A(ω) Σ_{j=1}^m (1 − l_j) I_{l_{j−1} ≤ X < l_j}(ω)

Obviously X_S ≤ X. Because Z_S = I_A ≤ X + Y, we have X(ω) + Y(ω) ≥ 1 for all ω ∈ A. Hence on A, for l_{j−1} ≤ X < l_j, we have Y ≥ 1 − X > 1 − l_j = Y_S. This holds for all j, and hence Y_S ≤ Y. Finally, because (1 − l_j) + l_{j−1} = 1 − 1/m = 1 − ε, we have

X_S(ω) + Y_S(ω) = I_A(ω) I_{X ≥ 1}(ω) + (1 − ε) I_A(ω) Σ_{j=1}^m I_{l_{j−1} ≤ X < l_j}(ω) ≥ (1 − ε) I_A(ω)

Hence for all Z_S ≤ X + Y

E(X) + E(Y) ≥ E(X_S) + E(Y_S) ≥ (1 − ε)E(Z_S)

and if we take the sup over all Z_S ≤ X + Y, we find E(X) + E(Y) ≥ (1 − ε)E(X + Y). Because ε > 0 is arbitrary, the inequality remains true as ε → 0. From the definition it follows directly that for all α ≥ 0, E(αX) = αE(X).

(iv) If X(ω) ≤ Y(ω) for all ω ∈ Ω, then E(X) ≤ E(Y).
Proof: E(Y) = E(X) + E(Y − X) ≥ E(X).

(v) If X_n ↑ X is an increasing sequence of nonnegative Borel measurable functions, then E(X_n) ↑ E(X). This is the monotone convergence property.

Proof: Let X_S = Σ_{i=1}^m α_i I_{A_i} be a simple function with X_S ≤ X, and let ε > 0. Define the simple functions

X_nS(ω) = Σ_{i=1}^m (1 − ε) α_i I_{A_i}(ω) I_{X_n ≥ (1−ε)α_i}(ω)

Then X_nS ≤ X_n and

E(X_n) ≥ E(X_nS) = (1 − ε) Σ_{i=1}^m α_i P(A_i ∩ {ω : X_n(ω) ≥ (1 − ε)α_i})

Because X_n ↑ X ≥ α_i for ω ∈ A_i, we have A_i ∩ {ω : X_n(ω) ≥ (1 − ε)α_i} ↑ A_i and hence P(A_i ∩ {ω : X_n(ω) ≥ (1 − ε)α_i}) → P(A_i). Hence for all X_S ≤ X

lim E(X_n) ≥ (1 − ε)E(X_S)

Take the sup over all X_S ≤ X and let ε → 0 to obtain

lim E(X_n) ≥ E(X)

Because X_n ≤ X, also

lim E(X_n) ≤ E(X)

Extension to all random variables. Until now E(X) has only been defined for nonnegative random variables. For an arbitrary random variable X we can always write X(ω) = X⁺(ω) − X⁻(ω) with X⁺(ω) = max{X(ω), 0} and X⁻(ω) = −min{X(ω), 0}. Note that X⁺, X⁻ are nonnegative. We define

E(X) = E(X⁺) − E(X⁻) = ∫_Ω X⁺ dP − ∫_Ω X⁻ dP

This is well-defined unless E(X⁺) = E(X⁻) = ∞. To avoid this we can require E(X⁺) < ∞, E(X⁻) < ∞, or E(|X|) < ∞. A random variable X with E(|X|) < ∞ is called integrable.

Application: Jensen's inequality. A function f : R → R is convex if for all 0 < λ < 1, f(λx_1 + (1 − λ)x_2) ≤ λf(x_1) + (1 − λ)f(x_2). If f is convex, E = {x : f(x) ≤ t} is a convex subset of R and hence an interval. Hence f is Borel measurable. Proof: if x_1, x_2 ∈ E, then f(λx_1 + (1 − λ)x_2) ≤ λf(x_1) + (1 − λ)f(x_2) ≤ t. Moreover, for all x, x_0, f(x) ≥ f(x_0) + α(x − x_0) with α a constant that may depend on x_0.

Note

f(x) ≥ f(x_0) + α(x − x_0) ≥ −|f(x_0)| − |α|(|x| + |x_0|)

so that f(X)⁻ ≤ |f(x_0)| + |α|(|X| + |x_0|). Hence E(f(X)⁻) < ∞ if X is integrable, and E(f(X)) is well-defined. Take x_0 = E(X) to obtain

E(f(X)) ≥ f(E(X)) + α(E(X) − E(X)) = f(E(X))
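Jensen's inequality can be sanity-checked on the two-toss distribution with the convex function f(x) = x², an ad-hoc choice for illustration:

```python
from fractions import Fraction

# X takes values 0, 1, 2 with probabilities 1/4, 1/2, 1/4 (two-toss example).
dist = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
f = lambda x: x * x                      # a convex function

EX = sum(x * p for x, p in dist.items())         # E(X)   = 1
EfX = sum(f(x) * p for x, p in dist.items())     # E(f(X)) = 3/2
assert EfX >= f(EX)                              # Jensen: 3/2 >= 1
```

The gap E(f(X)) − f(E(X)) = 1/2 is exactly the variance of X here, since f(x) = x².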

Lebesgue integrals. The expectation of X is the integral of X w.r.t. the probability measure P. The same definition applies if P is replaced by a measure µ, i.e. if the condition µ(Ω) = 1 is dropped and replaced by µ(∅) = 0 (the other conditions remain). A special case is Lebesgue measure, defined by m([a, b]) = b − a, the length of the interval. It implies m((a, b)) = b − a (because the Lebesgue measure of a point is 0), and because the open intervals are a generating class the definition can be uniquely extended to all sets in the Borel field B.

The integral of a Borel measurable f : R → R w.r.t. Lebesgue measure is denoted by ∫ f(x) dx. The notation is the same as for the (improper) Riemann integral of f. If f is integrable, i.e. if the Lebesgue integral ∫ |f(x)| dx < ∞, then the Lebesgue integral is equal to the Riemann integral if the latter exists. If ∫ |f(x)| dx = ∞, the improper Riemann integral lim_{t→∞} ∫_{−t}^{t} f(x) dx may exist, while the Lebesgue integral is not defined. Except for this special case you can compute Lebesgue integrals with all the calculus tricks. The theory of Lebesgue integration is easier than that of Riemann integration, in particular if the order of integration and limit, or of integration and differentiation, has to be interchanged.

Integration and limits. Often we have a sequence of random variables X_n, n = 1, 2, ... and we need to know lim E(X_n) = lim ∫ X_n dP. Can we interchange limit and integral? Similarly, we may want to take the derivative w.r.t. t of E(f(X, t)) = ∫ f(X, t) dP. Can we interchange differentiation and integration?

What can go wrong: consider the probability space ([0, 1], B[0, 1], P) with B[0, 1] the σ-field obtained by the intersections of the sets in B with [0, 1], and P((a, b)) = b − a. Define the sequence X_n(ω) = n² I_{(0, 1/n)}(ω). Then X_n(ω) → 0 for all 0 ≤ ω ≤ 1, but E(X_n) = n² · (1/n) = n. Hence

lim E(X_n) = ∞ ≠ 0 = E(lim X_n)

Theorem 2 (Fatou's Lemma). Let X_n be a sequence of nonnegative random variables (it need not converge); then

E(lim inf X_n) ≤ lim inf E(X_n)

Proof: Remember lim inf X_n = lim_n inf_{m≥n} X_m. Define Y_n = inf_{m≥n} X_m. We have Y_n ≤ X_n for all n. Moreover, Y_n is an increasing sequence of nonnegative random variables, so by monotone convergence E(lim inf X_n) = lim E(Y_n). Finally, because E(X_n) ≥ E(Y_n), we have lim E(Y_n) ≤ lim inf E(X_n).
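The counterexample can be verified exactly: E(X_n) = n² · P((0, 1/n)) = n, while the pointwise limit is 0. A small sketch (the helper E is ad hoc):

```python
from fractions import Fraction

# X_n = n^2 * 1_{(0, 1/n)} on ([0,1], B[0,1], P) with P((a,b)) = b - a.
def E(n):
    """E(X_n) = value on the interval times its probability."""
    return Fraction(n * n) * Fraction(1, n)   # n^2 * (1/n) = n

assert [E(n) for n in (1, 2, 3)] == [1, 2, 3]   # E(X_n) = n, diverges
# yet X_n(omega) -> 0 for every fixed omega, so E(lim X_n) = 0;
# Fatou's inequality holds: E(liminf X_n) = 0 <= liminf E(X_n).
```

The example shows the Fatou inequality can be strict: the mass n²·(1/n) escapes to a shrinking interval near 0.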

Theorem 3 (Dominated convergence). Let X_n be a sequence of integrable random variables and let the limit lim X_n(ω) = X(ω) exist for all ω ∈ Ω. If there is a nonnegative integrable random variable Y such that |X_n(ω)| ≤ Y(ω) for all ω ∈ Ω and all n, then X is integrable and lim E(X_n) = E(X).

Proof: |X| ≤ Y and hence X is integrable. Consider the sequences Y + X_n and Y − X_n, which are both nonnegative and integrable. By Fatou's lemma

E(Y + X) = E(lim inf (Y + X_n)) ≤ E(Y) + lim inf E(X_n)

E(Y − X) = E(lim inf (Y − X_n)) ≤ E(Y) − lim sup E(X_n)

because lim inf (−X_n) = −lim sup X_n. Cancel E(Y) to obtain

lim sup E(X_n) ≤ E(X) ≤ lim inf E(X_n)

Application: Let f(X, t) be an integrable random variable for −δ < t < δ with δ > 0, let f(x, t) be differentiable in t on that interval for all x, and assume that for all x and −δ < t < δ

|∂f(x, t)/∂t| ≤ M(x)

with M(X) an integrable random variable. By the mean value theorem

(f(x, t) − f(x, 0))/t = ∂f(x, t̄(x))/∂t

with t̄(x) = λ(x)t for some 0 ≤ λ(x) ≤ 1. Define the sequence of random variables

(f(X, t_n) − f(X, 0))/t_n

with t_n → 0. We have

lim E( (f(X, t_n) − f(X, 0))/t_n ) = lim (E(f(X, t_n)) − E(f(X, 0)))/t_n

By dominated convergence (the difference quotients are bounded by the integrable M(X)) we can interchange the limit and the expectation (integration), so that

E( ∂f(X, t)/∂t ) = ∂E(f(X, t))/∂t
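The interchange of differentiation and expectation can be sanity-checked numerically for an assumed f(x, t) = sin(tx) and X uniform on {1, 2, 3} (an ad-hoc example; here |∂f/∂t| = |x cos(tx)| ≤ 3, so the dominating M(x) exists):

```python
import math

xs = [1.0, 2.0, 3.0]                                     # support of X, uniform
Ef = lambda t: sum(math.sin(t * x) for x in xs) / 3      # E(f(X, t))
Edf = lambda t: sum(x * math.cos(t * x) for x in xs) / 3 # E(df/dt (X, t))

t, h = 0.5, 1e-6
numeric = (Ef(t + h) - Ef(t - h)) / (2 * h)  # central-difference d/dt E(f(X, t))
assert abs(numeric - Edf(t)) < 1e-5          # matches E(df/dt) as claimed
```

With a finite X the interchange is trivially valid; the point of the theorem is that the domination condition extends it to general X.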

Sets of measure 0. In integrals/expectations, sets E ∈ A with P(E) = 0 can be neglected.

Theorem 4. If the random variables X and Y are such that E = {ω : X(ω) ≠ Y(ω)} has P(E) = 0, then E(X) = E(Y).

Proof: For each ω, if n is sufficiently large, then X(ω) ≤ Y(ω) + n·I_{X≠Y}(ω). Because the sequence on the rhs is increasing, we have by monotone convergence

E(X) ≤ E(lim (Y + n·I_{X≠Y})) = lim E(Y + n·I_{X≠Y}) = lim (E(Y) + n·P(E)) = E(Y)

Interchange X and Y to obtain E(Y) ≤ E(X).