RS Chapter 1 Random Variables 6/5/2017. Chapter 1. Probability Theory: Introduction

Similar documents
Chapter 1: Probability Theory Lecture 1: Measure space and measurable function

Chapter 1: Probability Theory Lecture 1: Measure space, measurable function, and integration

18.175: Lecture 2 Extension theorems, random variables, distributions

1.1. MEASURES AND INTEGRALS

Spring 2014 Advanced Probability Overview. Lecture Notes Set 1: Course Overview, σ-fields, and Measures

Measure and integration

Lebesgue measure on R is just one of many important measures in mathematics. In these notes we introduce the general framework for measures.

Notes on Measure, Probability and Stochastic Processes. João Lopes Dias

II - REAL ANALYSIS. This property gives us a way to extend the notion of content to finite unions of rectangles: we define

Measure and Integration: Concepts, Examples and Exercises. INDER K. RANA Indian Institute of Technology Bombay India

Analysis of Probabilistic Systems

Chapter 4. Measure Theory. 1. Measure Spaces

Integration on Measure Spaces

STAT 7032 Probability Spring Wlodek Bryc

Statistical Inference

Measure Theoretic Probability. P.J.C. Spreij

Lecture 3: Probability Measures - 2

Measures. Chapter Some prerequisites. 1.2 Introduction

The main results about probability measures are the following two facts:

Basic Measure and Integration Theory. Michael L. Carroll

Measures and Measure Spaces

Lebesgue Measure. Dung Le 1

Review of measure theory

Construction of a general measure structure

Measure Theoretic Probability. P.J.C. Spreij

Probability Theory Review

Recitation 2: Probability

Notes 1 : Measure-theoretic foundations I

THEOREMS, ETC., FOR MATH 515

36-752: Lecture 1. We will use measures to say how large sets are. First, we have to decide which sets we will measure.

MATHS 730 FC Lecture Notes March 5, Introduction

Chapter 1. Measure Spaces. 1.1 Algebras and σ algebras of sets Notation and preliminaries

Almost Sure Convergence of a Sequence of Random Variables

2 (Bonus). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

(2) E M = E C = X\E M

3 (Due ). Let A X consist of points (x, y) such that either x or y is a rational number. Is A measurable? What is its Lebesgue measure?

Measures. 1 Introduction. These preliminary lecture notes are partly based on textbooks by Athreya and Lahiri, Capinski and Kopp, and Folland.

Lecture 9: Conditional Probability and Independence

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

2 Measure Theory. 2.1 Measures

Theorem 2.1 (Caratheodory). A (countably additive) probability measure on a field has an extension. n=1

The Caratheodory Construction of Measures

Annalee Gomm Math 714: Assignment #2

MATH & MATH FUNCTIONS OF A REAL VARIABLE EXERCISES FALL 2015 & SPRING Scientia Imperii Decus et Tutamen 1

The Lebesgue Integral

Lecture 11: Random Variables

1 Measurable Functions

Statistical Theory 1

1.4 Outer measures 10 CHAPTER 1. MEASURE

Lecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable

Chapter 1. Sets and probability. 1.3 Probability space

+ = x. ± if x > 0. if x < 0, x

Summary of Real Analysis by Royden

2 Probability, random elements, random sets

Lebesgue Measure on R n

Module 1. Probability

Probability as Logic

Topic 5 Basics of Probability

Random Process Lecture 1. Fundamentals of Probability

MATH/STAT 235A Probability Theory Lecture Notes, Fall 2013

AN ALGEBRAIC APPROACH TO GENERALIZED MEASURES OF INFORMATION

02. Measure and integral. 1. Borel-measurable functions and pointwise limits

Real Analysis Chapter 1 Solutions Jonathan Conder

Measure Theoretic Probability. P.J.C. Spreij. (minor revisions by S.G. Cox)

Probability. Lecture Notes. Adolfo J. Rumbos

Independent random variables

Real Analysis Notes. Thomas Goller

(U) =, if 0 U, 1 U, (U) = X, if 0 U, and 1 U. (U) = E, if 0 U, but 1 U. (U) = X \ E if 0 U, but 1 U. n=1 A n, then A M.

I. ANALYSIS; PROBABILITY

Lecture 1: Overview of percolation and foundational results from probability theory 30th July, 2nd August and 6th August 2007

18.175: Lecture 3 Integration

Lecture 10: Probability distributions TUESDAY, FEBRUARY 19, 2019

Probability and Random Processes

Chapter 5. Measurable Functions

Continuum Probability and Sets of Measure Zero

STAT 712 MATHEMATICAL STATISTICS I

Measure Theory. John K. Hunter. Department of Mathematics, University of California at Davis

Probability and Measure

Introduction and Preliminaries

Probability and Measure. November 27, 2017

Homework 11. Solutions

Notes on the Lebesgue Integral by Francis J. Narcowich November, 2013

Lebesgue Integration on R n

Math 4121 Spring 2012 Weaver. Measure Theory. 1. σ-algebras

Indeed, if we want m to be compatible with taking limits, it should be countably additive, meaning that ( )

MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.436J/15.085J Fall 2008 Lecture 3 9/10/2008 CONDITIONING AND INDEPENDENCE

MATH41011/MATH61011: FOURIER SERIES AND LEBESGUE INTEGRATION. Extra Reading Material for Level 4 and Level 6

Part II: Markov Processes. Prakash Panangaden McGill University

STA 711: Probability & Measure Theory Robert L. Wolpert

Lecture 2: Random Variables and Expectation

Math-Stat-491-Fall2014-Notes-I

Stat 451: Solutions to Assignment #1

Review of Probability. CS1538: Introduction to Simulations

Statistics 1 - Lecture Notes Chapter 1

Statistics and Econometrics I

Probability and Measure

JUSTIN HARTMANN. F n Σ.

FUNDAMENTALS OF REAL ANALYSIS by. II.1. Prelude. Recall that the Riemann integral of a real-valued function f on an interval [a, b] is defined as

Lebesgue Measure on R n

Lebesgue Integration: A non-rigorous introduction. What is wrong with Riemann integration?

Transcription:

Chapter 1 Probability Theory: Introduction Basic Probability General In a probability space (Ω, Σ, P), the set Ω is the set of all possible outcomes of a probability experiment. Mathematically, Ω is just a set, with elements ω. It is called the sample space. An event is the answer to a yes/no question. Equivalently, an event is a subset of the probability space: A Ω. Think of A as the set of outcomes where the answer is yes, and A c is the complementary set where the answer is no. A σ-algebra is a mathematical model of a state of partial knowledge about the outcome. Informally, if Σ is a σ-algebra and A Ω, we say that A Σ if we know whether ω A or not. 1

Definitions Algebra Definitions: Semiring A collection of sets F is called a semiring if it satisfies: Φ F. If A, B F, then A B F. If A, B F, then there exists a collection of sets C 1, C 2,..., C n F, such that A \ B =U (I >= 1) C i. (A\B all elements of A not in B) Definitions: Algebra A collection of sets F is called an algebra if it satisfies: Φ F. If ω 1 F, then ω C 1 F. (F is closed under complementation) If ω 1 F & ω 2 F, then ω 1 U ω 2 F. (F is closed under finite unions). Definitions: Sigma-algebra Definition: Sigma-algebra A sigma-algebra (σ-algebra or σ-field) F is a set of subsets ω of Ω s.t.: Φ F. If ω F, then ω C F. (ω C = complement of ω) If ω 1, ω 2,, ω n, F, then U (I >= 1) ω i F. Note that the set E = {{Φ},{a},{b},{c},{a,,b},{b,c},{a,c},{a,b,c}} is an algebra and a σ-algebra. Sigma algebras are a subset of algebras in the sense that all σ- algebras are algebras, but not vice versa. Algebras only require that they be closed under pairwise unions while σ-algebras must be closed under countably infinite unions. 2

Sigma-algebra Theorem: All σ-algebras are algebras, and all algebras are semi-rings. Thus, if we require a set to be a semiring, it is sufficient to show instead that it is a σ-algebra or algebra. Sigma algebras can be generated from arbitrary sets. This will be useful in developing the probability space. Theorem: For some set X, the intersection of all σ-algebras, A i, containing X -that is, x X =>x A i for all i-- is itself a σ-algebra, denoted σ(x). =>This is called the σ-algebra generated by X. Sample Space, Ω Definition: Sample Space The sample space Ω is the set of all possible unique outcomes of the experiment at hand. Example: If we roll a die, Ω = {1; 2; 3; 4; 5; 6}. In the probability space, the σ-algebra we use is σ (Ω), the σ-algebra generated by Ω. Thus, take the elements of Ω and generate the "extended set" consisting of all unions, compliments, compliments of unions, unions of compliments, etc. Include Φ; with this "extended set" and the result is σ (Ω), which we denote as Σ. Definition The σ-algebra generated by Ω, denoted Σ, is the collection of possible events from the experiment at hand. 3

(Ω, Σ) We have an experiment with Ω= {1, 2}. Then, Σ={{Φ},{1},{2},{1,2}}. Each of the elements of Σ is an event. Think of events as descriptions of experiment outcomes (Φ: the nothing occurs event). Note that σ-algebras can be defined over the real line as well as over abstract sets. To develop this notion, we need the concept of a topology. Definition: A topology τ on a set X is a collection of subsets of X satisfying: 1. Φ;X τ 2. τ is closed under finite intersections. 3. τ is closed under arbitrary unions. Any element of a topology is known as an open set. Borel σ-algebra Definition: Borel σ-algebra (Emile Borel (1871-1956), France.) The Borel σ-algebra (or, Borel field,) denoted B, of the topological space (X; τ) is the σ-algebra generated by the family τ of open sets. Its elements are called Borel sets. Lemma: Let C = {(a; b): a < b}. Then σ(c) = B R is the Borel field generated by the family of all open intervals C. What do elements of B R look like? Take all possible open intervals. Take their compliments. Take arbitrary unions. Include Φ and R. B R contains a wide range of intervals including open, closed, and half-open intervals. It also contains disjoint intervals such as {(2; 7] U (19; 32)}. It contains (nearly) every possible collection of intervals that are imagined. 4

Measures Definition: Measurable Space A pair (X,Σ) is a measurable space if X is a set and Σ is a nonempty σ- algebra of subsets of X. A measurable space allows us to define a function that assigns realnumbered values to the abstract elements of Σ. Definition: Measure μ Let (X,Σ) be a measurable space. A set function μ defined on Σ is called a measure iff it has the following properties. 1. 0 μ(a) for any A Σ. 2. μ(φ) = 0. 3. (σ-additivity). For any sequence of pairwise disjoint sets {A n } Σ such that U n=1 A n Σ, we have μ(u n=1 A n ) = Σ n=1 μ(a n ) Measures Intuition: A measure on a set, S, is a systematic way to assign a number to each suitable subset of that set, intuitively interpreted as its size. In this sense, it generalizes the concepts of length, area, volume. S Examples of measures: - Counting measure: μ(s) = number of elements in S. - Lebesgue measure on R: μ(s) = conventional length of S. That is, if S=[a,b] => μ(s) = λ[a,b] = b-a. 0 5

Measures & Measure Space Note: A measure μ may take as its value. Rules: (1) For any x R, + x =, x * = if x > 0, x * = if x < 0, and 0 * = 0; (2) + = ; (3) * a = for any a > 0; (4) or / are not defined. Definition: Measure Space A triplet (X,Σ,μ) is a measure space if (X,Σ) is a measurable space and μ: Σ [0; ) is a measure. If μ(x) = 1, then μ is a probability measure, which we usually use notation P, and the measure space is a probability space. Lebesgue Measure Lebesgue measure. There is a unique measure λ on (R, B R ) that satisfies λ([a, b]) = b a for every finite interval [a, b], < a b <. This is called the Lebesgue measure. If we restrict λ to the measurable space ([0, 1], B [0,1 ]), then λ is a probability measure. Examples: - Any Cartesian product of the intervals [a,b] x [c,d] is Lebesgue measurable, and its Lebesgue measure is λ=(b a)(d c). - λ([set of rational numbers in an interval of R]) = 0. Note: Not all sets are Lebesgue measurable. See Vitali sets. 6

Zero Measure Definition: Measure Zero A (μ-)measurable set E is said to have (μ-)measure zero if μ(e) = 0. Examples: The singleton points in R n, and lines and curves in R n, n 2. By countable additivity, any countable set in R n has measure zero. A particular property is said to hold almost everywhere if the set of points for which the property fails to hold is a set of measure zero. Example: a function vanishes almost everywhere ; f = g almost everywhere. Note: Integrating anything over a set of measure zero produces zero. Changing a function on a set of measure zero does not affect the value of its integral. Measure: Properties Let (Ω, Σ, μ) be a measure space. Then, μ has the following properties: (i) (Monotonicity). If A B, then μ(a) μ(b). (ii) (Subadditivity). For any sequence A 1,A 2,..., μ(u i=1 A i ) Σ i=1 μ(a i ) (iii) (Continuity). If A 1 A 2 A 3... (or A 1 A 2 A 3... and μ(a 1 ) < ), then μ(lim n A n ) = lim n μ(a n ), where lim n A n = U i=1 A i (or i=1 A i ) 7

Probability Space A measure space (Ω, Σ, μ) is called finite if μ(ω) is a finite real number (not ). A measure μ is called σ-finite if Ω can be decomposed into a countable union of measurable sets of finite measure. For example, the real numbers with the Lebesgue measure are σ-finite but not finite. Definition: Probability Space A measure space is a probability space if μ(ω)=1. In this case, μ is a probability measure, which we denote P. Let P be a probability measure. The cumulative distribution function (c.d.f.) of P is defined as: F(x) = P ((, x]), x R Product Space Definition: Product space Let Г i, i I, be sets, where I = {1,..., k}, k is finite or Define the product space as Π i I Г i = Г 1 Г k = {(a 1,..., a k ) : a i Г i, i I} Example: R R = R 2, R R R = R 3 Let (Ω i, Σ i ), i I, be measurable spaces Π i I Σ i is not necessarily a σ-field σ(π i I Σ i ) is called the product σ-field on the product space Π i I Ω i (Π i I Ω i, σ(π i I Σ i )) is denoted by Π i I (Ω i,σ i ) Example: Π i=1,...k (R, B) = (R k, B k ). 8

Product Measure Definition: Product measure Consider a rectangle [a 1,b 1 ] [a 2,b 2 ] R 2. The product measure of the usual area [a 1,b 1 ] [a 2,b 2 ]: (b 1 a 1 )(b 2 a 2 ) = μ([a 1, b 1 ]) μ([a 2, b 2 ]) Q: Is μ([a 1, b 1 ]) μ([a 2, b 2 ]) the same as the value of a measure defined on the product σ-field? A measure μ on (Ω, Σ) is said to be σ-finite if and only if there exists a sequence A 1,A 2,..., such that A i = Ω and μ(a i ) < for all i Any finite measure (such as a probability measure) is clearly σ-finite. The Lebesgue measure on R is σ-finite, since R = A n with A n = ( n, n), n = 1, 2,... The counting measure is σ-finite if and only if Ω is countable. Product Measure & Joint CDF Theorem: Product measure theorem Let (Ω i, Σ i, μ i ), i = 1,..., k, be measure spaces with σ-finite measures, where k 2 is an integer. Then there exists a unique σ-finite measure on the product σ-field σ(a 1... A k ), called the product measure and denoted by μ 1... μ k (Σ 1... Σ k ) = μ(a 1 )... μ(a k ) for all A i Σ i, i = 1,..., k. Let P be a probability measure on (R k, B k ). The c.d.f. (or joint c.d.f.) of P is defined by F(x 1,...,x k ) = P((, x 1 ]... (, x k ]), x i R There is a one-to-one correspondence between probability measures and joint c.d.f. s on R k. 9

Product Measure & Marginal CDF If F(x 1,...,x k ) is a joint c.d.f., then F i (x) = lim xj,j=1,...,i 1,i+1,...,k F(x 1,..,x k ) is a cdf and is called the i th marginal cdf. Marginal cdf s are determined by their joint c.d.f. But, a joint cdf cannot be determined by k marginal cdf s. If F(x 1,..,x k ) = F(x 1 )... F(x k ), then, the probability measure corresponding to F is the product measure P 1... P k with P i being the probability measure corresponding to F i Measurable Function Definition: Inverse function Let f be a function from Ω to Λ (often Λ = R k ) Let the inverse image of B Λ under f: f 1 (B) = {f B} = {ω Ω : f(ω) B}. Useful properties: f 1 (B c ) = (f 1 (B)) c for any B Λ ; f 1 ( B i ) = f 1 (B i ) ) for any B i Λ, i = 1, 2,... Note: The inverse function f 1 need not exist for f 1 (B) to be defined. Definition: Let (Ω, Σ) and (Λ, G) be measurable spaces and f a function from Ω to Λ. The function f is called a measurable function from (Ω, Σ) to (Λ, G) if and only if f 1 (G) Σ. 10

Measurable Function If f is measurable from (Ω,Σ) to (Λ,G) then f 1 (G) is a sub-σ-field of Σ. It is called the σ-field generated by f and is denoted by σ(f). If f is measurable from (Ω,Σ) to (R,B), it is called a Borel function or a random variable (RV). A random variable is a convenient way to express the elements of Ω as numbers rather than abstract elements of sets. Measurable Function Example: Indicator function for A Ω. I A (ω) = 1 if ω A = 0 if ω A C For any B R I 1 A (B) = if 0 not in B, 1 not in B = A if 0 not in B, 1 B = A c if 0 B, 1 not in B = Ω if 0 B, 1 B Then, σ(i A ) = {,A,A c, Ω} and I A is Borel if and only if A Σ. σ(f) is much simpler than Σ. Note: We express the elements of Ω as numbers rather than abstract elements of sets. 11

Measurable Function - Properties Theorems: Let (Ω, Σ) be a measurable space. (i) f is a RV if and only if f -1 (a, ) Σ for all a R. (ii) If f and g are RVs, then so are fg and af + bg, where a and b R; also, f/g is a RV provided g(ω) 0 for any ω Ω. (iii) If f 1, f 2,... are RVs, then so are sup n f n, inf n f n, limsup n f n, and liminf n f n. Furthermore, the set A = {ω Ω: lim n f n (ω) exists } is an event and the function h(ω) = lim n f n (ω) ω A = f 1 (ω) ω A c is a RV. Measurable Function - Properties Theorems: Let (Ω, Σ) be a measurable space. (iv) (Closed under composition) Suppose that f is measurable from (Ω, Σ) to (Λ, G) and g is measurable from (Λ, G) to (,H). Then, the composite function g f is measurable from (Ω, Σ) to (,H). (v) Let Ω be a Borel set in R p. If f is a continuous function from Ω to R p, then f is measurable. 12

Distribution Definition Let (Ω,Σ,μ) be a measure space and f be a measurable function from (Ω, Σ) to (Λ, G). The induced measure by f, denoted by μ f 1, is a measure on G defined as μ f 1 (B) = μ(f B) = μ( f 1 (B)), B G If μ = P is a probability measure and X is a random variable or a random vector, then P X 1 is called the distribution (or the law) of X and is denoted by P X. The cdf of P X is also called the cdf (or joint cdf) of X and is denoted by F X. Probability Space Definition and Axioms Kolmogorov's axioms Kolmogorov defined a list of axioms for a probability measure. Let P: E [0; 1] be our probability measure and E be some σ-algebra (events) generated by X. Axiom 1: P[A] 1 for all A E Axiom 2. P[X] = 1 Axiom 3. P[A 1 UA 2 U...UA n ] = P[A 1 ]+P[A 2 ]+:::+P[A n ], where {A 1 ;A 2 ;... ;A n } are disjoint sets in E. 13

Probability Space Properties of P The three Kolmogorov s basic axioms imply the following results: Theorem: P[A ic ] = 1 - P[A i ]. Theorem: P[Φ] = 0 Theorem: P[A i ] [0,1]. Theorem: P[B A C ] = P[B] - P[A B] Theorem: P[AUB] = P[A] + P[B] - P[A B] Theorem: A is in B => P[A] P[B] Theorem: A = B => P[A] = P[B] Theorem: P[A] = Σ i=1 P[A C i ] where {C 1 ; C 2 ;..} forms a partition of E. Theorem (Boole's Inequality, aka "Countable Subadditivity"): P[U i=1 A i ] Σ i=1 P[A i ] for any set of sets {A 1 ;A 2 ; ::} Probability Space - (Ω, Σ, P) Now, we have all the tools required to establish that (Ω, Σ, P) is a probability space. Theorem: Let Ω be the sample space of outcomes of an experiment, Σ be the σ- algebra of events generated from Ω, and P:Σ [0, ) be a probability measure that assigns a nonnegative real number to each event in Σ. The space (Ω, Σ, P) satisfies the definition of a probability space. Remark: The sample space is the list of all possible outcomes. Events are groupings of these outcomes. The σ-algebra Σ is the collection of all possible events. To each of these possible events, we assign some "size" using the probability measure P. 14

Probability Space Example: Consider the tossing of two fair coins. The sample space is {HH; HT; TH; TT}. Possible events: - The coins have different sides showing: {HT; TH}. - At least one head: {HH;HT; TH}. - First coin shows heads or second coin shows tails: {HH, HT, TT} The sigma algebra generated from the sample space is the collection of all possible such events: [Φ, {H,H}, {HT}, {TH}, {TT}, {HH,HT}, {HH,TH}, {HT,TH}, {TH,TT}, {HH,HT,TH}, {HH,HT,TT}, {HH,TH,TT}, {HT,TH,TT}, {HH,HT,TH,TT}] The probability measure P assigns a number from 0 to 1 to each of those events in the sigma algebra. Probability Space Example (continuation): The probability measure P assigns a number from 0 to 1 to each of those events in the sigma algebra. With a fair coin, we can assign the following probabilities to the events in the above sigma algebra: {0; 1/4; 1/4; 1/4; 1/4; 1/2; 1/2; 1/2; 1/2; 1/2; 1/2; 3/4; 3/4; 3/4; 3/4; 1} But, we do not have to use these fair values. We may believe the coin is biased so that heads appears 3/4 of the time. Then the following values for P would be appropriate: {0; 9/16; 3/16; 3/16; 1/16; 3/4; 3/4; 5/8; 3/8; 1/4; 1/4; 15/16; 13/16; 13/16; 7/16; 1} 15

Probability Space As long as the values of the probability measure are consistent with Kolmogorov's axioms and the consequences of those axioms, then we consider the probabilities to be mathematically acceptable, even if they are not reasonable for the given experiment. Philosophical comment: Can the probability values assigned be considered reasonable as long as they're mathematically acceptable? Random Variables Revisited A random variable is a convenient way to express the elements of Ω as numbers rather than abstract elements of sets. Definition: Measurable function Let A X ; A Y be nonempty families of subsets of X and Y, respectively. A function f: X Y is (A X ;A Y )-measurable if f -1 (A) є A X for all A є A Y. Definition: Random Variable A random variable X is a measurable function from the probability space (Ω, Σ,P) into the probability space (χ, A X, P X ), where χ in R is the range of X (which is a subset of the real line) A X is a Borel field of X, and P X is the probability measure on χ induced by X. Specifically, X: Ω χ,. 16