MTH 202 : Probability and Statistics


Lecture 5-8 : 15, 20, 21, 23 January, 2013
Random Variables and their Probability Distributions

3.1 : Random Variables

Often, while dealing with the probabilities of events, we need to perform arithmetic operations on sets, which can be tricky. It is easier to work instead with real-valued functions defined on the points of the sample space Ω. We call these random variables. Before we get to them, we need to know about a specific σ-field built on the open intervals of R, known as the Borel σ-field.

Theorem 3.1.1 : Let X be a non-empty set and P(X) denote its power set. For any collection C ⊆ P(X) of subsets of X, there exists a smallest σ-field of subsets of X which contains all the sets of C.

Proof : There is at least one σ-field which contains C, namely P(X) itself. Now collect all such σ-fields: T := {U : U is a σ-field of subsets of X and C ⊆ U}; the collection is non-empty since P(X) is a member of it. The intersection ∩_{U ∈ T} U fulfills the defining properties of a σ-field (Verify!) and it also contains C.

Note 3.1.2 : The σ-field constructed above is often denoted by σ(C) and called the σ-field generated by C. In practice it is hard to describe the members of σ(C) explicitly when a collection C ⊆ P(X) is given. Hence for practical purposes we will need a criterion that avoids dealing with general sets from σ(C).

Definition 3.1.3 : Let C be the collection of all open intervals of the form (a, b) where a, b ∈ R. Then the smallest σ-field in R generated by C, often denoted B(R), is called the Borel σ-field. The sets in B(R) are called Borel sets.
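On a finite base set, the generated σ-field of Theorem 3.1.1 can be computed by brute force: start from C together with ∅ and X, then close the family under complements and pairwise unions until nothing new appears. A minimal sketch (the function name is illustrative, not from the text):

```python
def generated_sigma_field(X, C):
    """Smallest sigma-field on the finite set X containing every set in C."""
    X = frozenset(X)
    fam = {frozenset(), X} | {frozenset(s) for s in C}
    changed = True
    while changed:
        changed = False
        # close under complements
        for A in list(fam):
            comp = X - A
            if comp not in fam:
                fam.add(comp)
                changed = True
        # close under pairwise unions (enough for a finite family)
        for A in list(fam):
            for B in list(fam):
                u = A | B
                if u not in fam:
                    fam.add(u)
                    changed = True
    return fam
```

For example, the σ-field on {1, 2, 3} generated by {{1}} is {∅, {1}, {2, 3}, {1, 2, 3}}. On infinite sets no such enumeration is possible, which is why the intersection argument in the proof above is needed.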

Exercise 3.1.4 : B(R) can also be described as the σ-field generated by each of the following collections:
(i) {(−∞, x] : x ∈ R}, (ii) {[x, ∞) : x ∈ R}, (iii) {(−∞, x) : x ∈ R}, (iv) {(x, ∞) : x ∈ R}, (v) {[x, y] : x, y ∈ R}, (vi) {[x, y) : x, y ∈ R}, (vii) {(x, y] : x, y ∈ R}.

We are now ready to encounter random variables:

Definition 3.1.5 : Let (Ω, S, P) be a probability space. A function F : Ω → R is called a random variable (RV in short) if

F^{-1}(B) = {ω ∈ Ω : F(ω) ∈ B} ∈ S

for every Borel set B ⊆ R. It is customary to denote a random variable by X instead of the usual function notation F.

From the previous exercise we can deduce:

Theorem 3.1.6 : Let (Ω, S, P) be a probability space. Then X is a random variable if and only if

X^{-1}((−∞, x]) = {ω ∈ Ω : X(ω) ≤ x} ∈ S

for all x ∈ R.

Example 3.1.7 : Suppose that we toss a coin thrice and count the number of heads in each outcome. Here the sample space is Ω = {abc : a, b, c ∈ {H, T}}. Define the probability on Ω by P({abc}) := 1/8 for all abc ∈ Ω. Suppose we speak of the event A = {at least two H's} and wish to calculate P(A). Instead of dealing with the set A directly, we can introduce a random variable X : Ω → R by

X(ω) := number of H's in ω.

For example X(HTH) = 2 = X(HHT), X(HHH) = 3, etc. The function X is a random variable simply because S is all of the power set P(Ω). Finally we write P(A) as P(X ≥ 2) and calculate P(A) = P(X^{-1}([2, ∞))) = 1/2.

Exercise 3.1.8 : Let X be an RV. Is |X| also an RV? If X is an RV that takes only non-negative values, is √X also an RV?

Solution : Let U_x be the set U_x := |X|^{-1}((−∞, x]) = {ω ∈ Ω : |X(ω)| ≤ x}.
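The coin-tossing computation of Example 3.1.7 can be checked by direct enumeration of the eight outcomes:

```python
from itertools import product

# sample space of three tosses; every outcome has probability 1/8
omega = [''.join(t) for t in product('HT', repeat=3)]
X = {w: w.count('H') for w in omega}            # X = number of H's
p_A = sum(1 for w in omega if X[w] >= 2) / len(omega)   # P(X >= 2)
```

Here p_A comes out to 1/2, agreeing with the calculation above: four of the eight outcomes (HHH, HHT, HTH, THH) have at least two heads.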

Then

U_x = ∅ if x < 0, X^{-1}({0}) if x = 0, X^{-1}([−x, x]) if x > 0.

Clearly these sets are in S, since X is an RV. Hence |X| is also an RV.

Next we recall that for x ≥ 0 in R the symbol √x denotes the non-negative square root of x. Let V_x be the set

V_x := (√X)^{-1}((−∞, x]) = {ω ∈ Ω : √X(ω) ≤ x}.

Then V_x = ∅ if x < 0. Now if x ≥ 0 we have

V_x = {ω : 0 ≤ X(ω) ≤ x²} = X^{-1}([0, x²]).

Hence √X is also an RV.

Exercise 3.1.9 : Let Ω = [0, 1] and S be the Borel σ-field of subsets of Ω. Define X : Ω → R by

X(ω) = ω if 0 ≤ ω ≤ 1/2, and X(ω) = ω − 1/2 if 1/2 < ω ≤ 1.

Is X an RV? If so, what is the event {ω : X(ω) ∈ (1/4, 1/2)}?

Solution : We notice that

X^{-1}((−∞, x]) = ∅ if x < 0, [0, x] ∪ (1/2, 1/2 + x] if 0 ≤ x < 1/2, and [0, 1] if x ≥ 1/2.

In every case the pre-image is a Borel subset of [0, 1], so X is an RV. For the event, ω ∈ [0, 1/2] contributes ω ∈ (1/4, 1/2), while ω ∈ (1/2, 1] contributes ω − 1/2 ∈ (1/4, 1/2), i.e., ω ∈ (3/4, 1). Hence {ω : X(ω) ∈ (1/4, 1/2)} = (1/4, 1/2) ∪ (3/4, 1).

3.2 : Probability distribution of a Random Variable

Theorem 3.2.1 : An RV X defined on a probability space (Ω, S, P) induces a probability space (R, B(R), Q) defined by

Q(B) := P(X^{-1}(B)) = P({ω : X(ω) ∈ B}) for all B ∈ B(R).

Proof : Ref. Pg 43, Sec. 2.3, Theorem 1 [RS].

Before we speak about the probability distribution, we first define the idea of a distribution function in general.
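The event in Exercise 3.1.9 works out to (1/4, 1/2) ∪ (3/4, 1), and this can be sanity-checked on a fine grid of points of Ω = [0, 1]; a sketch (the grid resolution is arbitrary):

```python
def X(w):
    # the RV of Exercise 3.1.9 on Omega = [0, 1]
    return w if w <= 0.5 else w - 0.5

n = 10_000
grid = [i / n for i in range(n + 1)]
event = [w for w in grid if 0.25 < X(w) < 0.5]
# every grid point of the event lies in (1/4, 1/2) or (3/4, 1)
ok = all(0.25 < w < 0.5 or 0.75 < w < 1 for w in event)
```

Note that ω = 3/4 and ω = 1 are excluded: X(3/4) = 1/4 and X(1) = 1/2, neither of which lies in the open interval (1/4, 1/2).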

Definition 3.2.2 : A function F : R → R is called a distribution function (DF) if:
(i) x < y implies F(x) ≤ F(y) for all x, y ∈ R (non-decreasing),
(ii) lim_{x → a+} F(x) = F(a) for all a ∈ R (right continuous),
(iii) F(−∞) = 0 (i.e., lim_{x → −∞} F(x) = 0) and F(+∞) = 1 (i.e., lim_{x → +∞} F(x) = 1).

Exercise 3.2.3 : Do the following functions define DF's?
(a) F(x) = 0 if x < 0, F(x) = x if 0 ≤ x < 1/2, and F(x) = 1 if x ≥ 1/2;
(b) F(x) = (1/π) tan^{-1} x (x ∈ R).

Solution : (a) Property (i) can easily be checked. Since the function is defined by patches of continuous functions, we only need to verify (ii) at x = 0 and x = 1/2. Now we see that lim_{x → 0+} F(x) = lim_{x → 0+} x = 0 = F(0). Similarly at x = 1/2 we have lim_{x → 1/2+} F(x) = 1 = F(1/2). The third property is clear since F merges with the constant functions 0 and 1 near −∞ and +∞ respectively. Hence (a) defines a DF.

(b) The limit at −∞ is lim_{x → −∞} (1/π) tan^{-1} x = −1/2 ≠ 0. Similarly, F(+∞) = 1/2. Hence this is not a distribution function.

Theorem 3.2.4 : The set of points where a DF F is discontinuous is at most countable.

Proof : Ref. Pg 44, Sec. 2.3, Theorem 2 [RS].

We now define the DF of an RV.

Definition 3.2.5 : Let X be an RV defined on a probability space (Ω, S, P). The function F : R → R defined by

F(x) = Q((−∞, x]) = P({ω ∈ Ω : X(ω) ≤ x}) (x ∈ R)

is called the distribution function of the RV X.

The name "distribution function of an RV" is surely given for some reason:

Theorem 3.2.6 : The function F defined as above is a DF.

Proof : Ref. Pg 45, Sec. 2.3, Theorem 3 [RS].
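The checks of Exercise 3.2.3(a) can be mirrored numerically; a sketch, with a small epsilon standing in for the one-sided limit (an approximation, not a proof):

```python
def F(x):
    # the candidate DF of Exercise 3.2.3(a)
    if x < 0:
        return 0.0
    if x < 0.5:
        return x
    return 1.0

eps = 1e-9
# right continuity at the two break points x = 0 and x = 1/2
right_cont = all(abs(F(a + eps) - F(a)) < 1e-6 for a in (0.0, 0.5))
# non-decreasing on a grid over [-5, 5]
xs = [i / 100 - 5 for i in range(1001)]
monotone = all(F(xs[i]) <= F(xs[i + 1]) for i in range(len(xs) - 1))
```

Observe that F is not *left* continuous at x = 1/2 (it jumps from 1/2 to 1), yet it is still a DF: only right continuity is required by Definition 3.2.2.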

In fact every DF can be shown to be the DF of an RV on some probability space. The proof of this will not be discussed in this course.

From now on we adopt the following notation: P({ω ∈ Ω : X(ω) ≤ α}) is denoted by P(X ≤ α), P({ω ∈ Ω : X(ω) < α}) is denoted by P(X < α), etc.

Exercise 3.2.7 : Does the following function define a DF? If so, find P(−∞ < X < 2).

F(x) = 1 − e^{−x} if x ≥ 0, and F(x) = 0 if x < 0.

Solution : F′(x) = e^{−x} > 0 shows that the function is strictly increasing on the positive half of the real line. It is constant on the negative side and 0 < 1 − e^{−x} for all x > 0. Hence F is non-decreasing. F is continuous for x ≥ 0, which implies F is right continuous at x = 0; at every other point F is indeed continuous. Finally, F is the constant function 0 on the negative side of the real line, showing F(−∞) = 0. Since lim_{x → +∞} e^{−x} = 0, we have F(+∞) = 1. Thus F is a DF.

Since F is a continuous function, P(X = a) = 0 for all a ∈ R (Why?). Hence

P(−∞ < X < 2) = P(−∞ < X ≤ 2) − P(X = 2) = F(2) − 0 = 1 − e^{−2}.

3.3 : Discrete and Continuous Random Variables

There will essentially be two distinct types of RV's we will be dealing with. First we discuss discrete RV's. Roughly speaking, a discrete RV is one for which the complete probability mass is concentrated at some discrete points (i.e., points which are separated from each other by a certain positive distance). First, we briefly recall the notion of a countable set.

Definition 3.3.1 : A set E is said to be countable if it is either finite, or else there is a bijection f : N → E. The set E being finite means that if you, along with some others, try to count the elements of E by the numbers 1, 2, 3, ..., the counting would theoretically stop at some point, no matter if the sun is extinct by then,
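The final computation of Exercise 3.2.7 is easy to replicate:

```python
import math

def F(x):
    # the DF of Exercise 3.2.7
    return 1.0 - math.exp(-x) if x >= 0 else 0.0

# F is continuous everywhere, so P(X = 2) = 0 and
# P(-inf < X < 2) = F(2) - P(X = 2) = 1 - e^{-2}
p = F(2.0)
```

Numerically p ≈ 0.8647, i.e., 1 − e^{−2}.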

or else the earth is evacuated by the rest of the humans while no one could change your interest in counting E.

Figure 1. WALL-E and EVA

On the other hand, a countably infinite set is impossible to count in any given finite time. However, as in the previous case, suppose that while counting by the numbers 1, 2, 3, ... you also put a tag on the elements with these numbers. We would call E countably infinite if every element gets a number tag n, however large it may be. In the previous definition the bijection f ensures that the tags, say n given to the element f(n), are all distinct.

Definition 3.3.2 : An RV X defined on a probability space (Ω, S, P) is said to be of discrete type (or simply discrete) if there is a countable set E ⊆ R such that P(X ∈ E) = 1.

A relevant query at this point is whether countable sets in R are Borel sets; otherwise it would be meaningless to talk about P(X ∈ E) = P(X^{-1}(E)). First we note that every singleton {x} in R is a Borel set by means of the infinite nested intersection

{x} = ∩_{n=1}^{∞} (x − 1/n, x + 1/n).

Thus every countable subset of R is a Borel set, being a countable union (either finite or infinite) of singletons. Now if it is known that P(X = x_i) = p_i ≥ 0 for all x_i ∈ E, we have from the definition of probability that

Σ_{n=1}^{∞} p_n = 1.
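The condition Σ p_n = 1 can be illustrated with a concrete discrete RV. A sketch using the (assumed, not from the text) geometric masses p_i = 2^{−i} on the countable set E = {1, 2, 3, ...}:

```python
# partial sums of p_i = 2^{-i} increase to 1: all the probability
# mass of this discrete RV sits on the countable set E = {1, 2, ...}
def partial_sum(n):
    return sum(0.5 ** i for i in range(1, n + 1))

tail = 1.0 - partial_sum(50)   # mass not yet accounted for after 50 terms
```

The partial sums 1/2, 3/4, 7/8, ... approach 1, so even though E is infinite, the whole unit of probability mass is carried by these countably many points.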

Definition 3.3.3 : The collection of non-negative real numbers {p_i}_{i=1}^{∞} satisfying P(X = x_i) = p_i for all i ∈ N and Σ_{i=1}^{∞} p_i = 1 is called the probability mass function (PMF) of the RV X. The DF F of X is given by

F(x) = P(X ≤ x) = Σ_{x_i ≤ x} p_i (x ∈ R).

The name "probability mass function" for the collection {p_i}_{i=1}^{∞} of non-negative real numbers may be misleading. In fact it can precisely be written as a function p : R → R by

p(x) = p_k if x = x_k (k = 1, 2, ...), and p(x) = 0 otherwise.

In general:

Definition 3.3.4 : Let {p_i}_{i=1}^{∞} be a collection of non-negative real numbers such that Σ_{i=1}^{∞} p_i = 1. Then {p_i}_{i=1}^{∞} is the PMF of some RV X.

Exercise 3.3.5 : For what value of K does the following define the probability mass function of some random variable:

f(x) = K/N (x = 1, 2, ..., N)?

Solution : We need Σ_{x=1}^{N} K/N = K = 1, so K = 1.

Next we consider the RV's associated to DF's of continuous type.

Definition 3.3.6 : Let X be an RV defined on a probability space (Ω, S, P) with DF F. Then X is said to be of continuous type if there is an integrable function f : R → [0, ∞) such that

F(x) = ∫_{−∞}^{x} f(t) dt (x ∈ R).

The function f is called the probability density function (PDF) of the RV X.

Properties 3.3.7 : Let f be the PDF of the RV X on the probability space (Ω, S, P). Then:
(i) ∫_{−∞}^{∞} f(t) dt = 1,
(ii) P(a < X ≤ b) = ∫_{a}^{b} f(t) dt.
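Definition 3.3.3 and Exercise 3.3.5 combine into a small computational check; a sketch with N = 6 (so K = 1 and each p_i = 1/6, as for a fair die):

```python
N = 6
p = {k: 1.0 / N for k in range(1, N + 1)}   # uniform PMF with K = 1

def F(x):
    # DF of a discrete RV: F(x) = sum of p_i over all x_i <= x
    return sum(pi for xi, pi in p.items() if xi <= x)

total = sum(p.values())   # the normalization constant K; should equal 1
```

The resulting F is a step function: it is 0 below 1, jumps by 1/6 at each of the points 1, ..., 6, and equals 1 from 6 onward.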

Theorem 3.3.8 : Every non-negative real function f that is integrable over R and satisfies ∫_{−∞}^{∞} f(t) dt = 1 is the PDF of some continuous RV X.

As a special note, we address a few comments regarding continuity of the distribution function.

Theorem 3.3.10 : Let F be the distribution function corresponding to an RV X over the probability space (Ω, S, P). If F is continuous at x = a, then P(X = a) = 0. Otherwise

P(X = a) = F(a) − F(a−) > 0.

Proof : Consider the sequence of event sets

E_n := {ω ∈ Ω : a − 1/n < X(ω) ≤ a} = X^{-1}((a − 1/n, a]).

Since (a − 1/n, a] is a Borel set, E_n ∈ S for all n ∈ N. But we see that E_1 ⊇ E_2 ⊇ E_3 ⊇ ..., i.e., the sequence {E_n}_{n=1}^{∞} is decreasing, and we have

∩_{n=1}^{∞} E_n = X^{-1}(∩_{n=1}^{∞} (a − 1/n, a]) = X^{-1}({a}).

Since {E_n}_{n=1}^{∞} is decreasing we have (see the corollary to Thm. 6, Pg 13, [RS])

lim_{n → ∞} P(E_n) = P(∩_{n=1}^{∞} E_n) = P(X^{-1}({a})) = P(X = a).

But P(E_n) = P(a − 1/n < X ≤ a) = F(a) − F(a − 1/n). Hence

lim_{n → ∞} P(E_n) = F(a) − lim_{n → ∞} F(a − 1/n) = F(a) − F(a−).

Now if F is continuous at x = a, it is left continuous there as well. Hence F(a) = F(a−) and P(X = a) = 0. Next, if F is not continuous at x = a, then since F is non-decreasing the left limit F(a−) exists and F(a−) < F(a). Hence the limit P(X = a) = F(a) − F(a−) > 0.

We finally note that if X is of continuous type, then F has a derivative almost everywhere; indeed F is absolutely continuous, a notion which is much stronger than continuity.
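Theorem 3.3.10 can be watched in action: for a DF with a jump, the differences F(a) − F(a − 1/n) decrease to the jump size F(a) − F(a−). A sketch reusing the step DF of Exercise 3.2.3(a), which jumps by 1/2 at a = 1/2:

```python
def F(x):
    # the DF of Exercise 3.2.3(a): continuous except for a jump at 1/2
    if x < 0:
        return 0.0
    if x < 0.5:
        return x
    return 1.0

a = 0.5
diffs = [F(a) - F(a - 1.0 / n) for n in (10, 100, 1000, 10000)]
# P(X = a) = lim_n [F(a) - F(a - 1/n)] = F(a) - F(a-) = 1 - 1/2
```

The computed differences 0.6, 0.51, 0.501, 0.5001 visibly decrease toward the jump size 1/2, exactly as the continuity-from-above argument in the proof predicts.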

For details, you may consult Chap. 5, Section 4, Cor. 12, [ROY]. In short, we have the following conclusion:

Corollary 3.3.11 : Let F be the distribution function corresponding to an RV X of continuous type over the probability space (Ω, S, P). Then P(X = a) = 0 for all a ∈ R. In particular, F is a continuous function.

Moreover, there are RV's whose types are neither continuous nor discrete. The DF's of these are not absolutely continuous. However, they may have a corresponding density (or probability) function which is a little tricky to describe. For example:

Example 3.3.12 : Is the following function a DF? If so, find the corresponding density or probability function:

F(x) = 0 if x < 1, (x − 1)²/8 if 1 ≤ x < 3, and 1 if x ≥ 3.

Solution : Except on the interval [1, 3), the function F is constant. In the open interval (1, 3) we have F′(x) = (x − 1)/4 > 0. Hence F is non-decreasing. Clearly, F(−∞) = 0 and F(+∞) = 1. Finally, F is defined piecewise by functions which are right continuous, implying F is right continuous. Therefore F is a DF. The corresponding density function f is given by

f(x) = 0 if x < 1, (x − 1)/4 if 1 ≤ x < 3, and 0 if x ≥ 3.

We note that F is not continuous at x = 3. In fact,

P(X = 3) = F(3) − F(3−) = 1 − 1/2 = 1/2 > 0.

References :
[ROY] Real Analysis, H.L. Royden, 3rd Edition, Macmillan Publishing Co.
[RS] An Introduction to Probability and Statistics, V.K. Rohatgi and A.K. Saleh, Second Edition, Wiley Students Edition.