MASSACHUSETTS INSTITUTE OF TECHNOLOGY
6.436J/15.085J Fall 2008
Lecture 8    10/1/2008

CONTINUOUS RANDOM VARIABLES


Contents

1. Continuous random variables
2. Examples
3. Expected values
4. Joint distributions
5. Independence

Readings: For a less technical version of this material, but with more discussion and examples, see Sections 3.1-3.5 of [BT] and Sections 4.1-4.5 of [GS].

1 CONTINUOUS RANDOM VARIABLES

Recall^1 that a random variable X : Ω → R is said to be continuous if its CDF can be written in the form

    P(X \le x) = F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt,

for some nonnegative measurable function f : R → [0, ∞), which is called the PDF of X. We then have, for any Borel set B,

    P(X \in B) = \int_{B} f_X(x)\,dx.

Technical remark: Since we have not yet defined the notion of an integral of a measurable function, the discussion in these notes will be rigorous only when we deal with integrals that can be interpreted in the usual sense of calculus, namely, Riemann integrals. For now, let us just concentrate on functions that are piecewise continuous, with a finite number of discontinuities.

^1 The reader should revisit Section 4 of the notes for Lecture 5.
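
As a quick numerical aside (not part of the original notes), the defining relation P(X ∈ B) = ∫_B f_X(x) dx can be checked for any piecewise-continuous density. The sketch below uses a hypothetical density f_X(x) = 2x on [0, 1]; the density, the interval B, and the use of SciPy are all illustrative assumptions.

```python
# Minimal sketch, assuming SciPy is available; the density below is a hypothetical example.
from scipy.integrate import quad

def f_X(x):
    # Hypothetical PDF: f_X(x) = 2x on [0, 1], and 0 elsewhere, so F_X(x) = x^2 on [0, 1].
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

total, _ = quad(f_X, 0.0, 1.0)      # integral over the support; ~1.0, so f_X is a legitimate PDF
prob_B, _ = quad(f_X, 0.2, 0.5)     # P(X in B) for the Borel set B = [0.2, 0.5]
print(total)                        # ~1.0
print(prob_B, 0.5**2 - 0.2**2)      # both ~0.21, i.e., F_X(0.5) - F_X(0.2)
```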

We note that f_X should be more appropriately called "a" (as opposed to "the") PDF of X, because it is not unique. For example, if we modify f_X at a finite number of points, its integral is unaffected, so multiple densities can correspond to the same CDF. It turns out, however, that any two densities associated with the same CDF are equal except on a set of Lebesgue measure zero.

A PDF is in some ways similar to a PMF, except that the value f_X(x) cannot be interpreted as a probability. In particular, the value of f_X(x) is allowed to be greater than one for some x. Instead, the proper intuitive interpretation is that if f_X is continuous over a small interval [x, x + δ], then

    P(x \le X \le x + \delta) \approx f_X(x)\,\delta.

Remark: The fact that a random variable X is continuous has no bearing on the continuity of X as a function from Ω into R. In fact, we have not even defined what it means for a function on Ω to be continuous. But even in the special case where Ω = R, we can have a discontinuous function X : R → R which is a continuous random variable. Here is an example. Let the underlying probability measure on Ω be the Lebesgue measure on the unit interval, and let

    X(\omega) =
    \begin{cases}
      \omega,     & 0 \le \omega \le 1/2, \\
      1 + \omega, & 1/2 < \omega \le 1.
    \end{cases}

The function X is discontinuous. The random variable X takes values in the set [0, 1/2] ∪ (3/2, 2]. Furthermore, it is not hard to check that X is a continuous random variable with PDF given by

    f_X(x) =
    \begin{cases}
      1, & x \in [0, 1/2] \cup (3/2, 2], \\
      0, & \text{otherwise}.
    \end{cases}
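
As an aside not in the original notes, this remark can be sanity-checked by simulation: draw ω uniformly on [0, 1] (the Lebesgue measure above), apply the discontinuous map, and verify that P(x ≤ X ≤ x + δ) ≈ f_X(x)δ. The sketch below assumes NumPy is available; the evaluation points are arbitrary.

```python
# Monte Carlo sketch of the remark's discontinuous map (NumPy assumed; points chosen for illustration).
import numpy as np

rng = np.random.default_rng(0)
omega = rng.uniform(0.0, 1.0, size=200_000)        # samples from the Lebesgue measure on [0, 1]
X = np.where(omega <= 0.5, omega, 1.0 + omega)     # X(w) = w on [0, 1/2], and 1 + w on (1/2, 1]

delta = 0.05
print(np.mean((1.6 <= X) & (X <= 1.6 + delta)))    # ~0.05 = f_X(1.6) * delta, since f_X = 1 on (3/2, 2]
print(np.mean((0.7 <= X) & (X <= 0.7 + delta)))    # ~0.0, since X puts no mass in (1/2, 3/2]
```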

2 EXAMPLES

We present here a number of important examples of continuous random variables.

2.1 Uniform

This is perhaps the simplest continuous random variable. Consider an interval [a, b], and let

    F_X(x) =
    \begin{cases}
      0,               & x \le a, \\
      (x - a)/(b - a), & a < x \le b, \\
      1,               & x > b.
    \end{cases}

It is easy to check that F_X satisfies the required properties of CDFs. We denote this distribution by U(a, b). We find that a corresponding PDF is given by f_X(x) = (dF_X/dx)(x) = 1/(b - a) for x ∈ [a, b], and f_X(x) = 0, otherwise. When [a, b] = [0, 1], the probability law of a uniform random variable is just the Lebesgue measure on [0, 1].

2.2 Exponential

Fix some λ > 0. Let F_X(x) = 1 - e^{-λx}, for x ≥ 0, and F_X(x) = 0, for x < 0. It is easy to check that F_X satisfies the required properties of CDFs. A corresponding PDF is f_X(x) = λe^{-λx}, for x ≥ 0, and f_X(x) = 0, for x < 0. We denote this distribution by Exp(λ).

The exponential distribution can be viewed as a limit of a geometric distribution. Indeed, if we fix some δ and consider the values of F_X(kδ), for k = 1, 2, ..., these values agree with the values of a geometric CDF (one with success probability p = 1 - e^{-λδ}). Intuitively, the exponential distribution corresponds to a limit of a situation where every δ time units, we toss a coin whose success probability is λδ, and let X be the time elapsed until the first success.

The distribution Exp(λ) has the following very important memorylessness property.

Theorem 1. Let X be an exponentially distributed random variable. Then, for every x, t ≥ 0, we have

    P(X > x + t \mid X > x) = P(X > t).

Proof: Let X be exponential with parameter λ. We have

    P(X > x + t \mid X > x) = \frac{P(X > x + t, X > x)}{P(X > x)}
                            = \frac{P(X > x + t)}{P(X > x)}
                            = \frac{e^{-\lambda(x + t)}}{e^{-\lambda x}}
                            = e^{-\lambda t} = P(X > t).

Exponential random variables are often used to model memoryless arrival processes, in which the elapsed waiting time does not affect our probabilistic model of the remaining time until an arrival. For example, suppose that the time until the next bus arrival is an exponential random variable with parameter λ = 1/5 (with time measured in minutes). Thus, there is probability e^{-1} that you will have to wait for at least 5 minutes. Suppose that you have already waited for 10 minutes. The probability that you will have to wait for at least another five minutes is still the same, e^{-1}.
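
As a numerical aside (not from the notes), the memorylessness identity of Theorem 1 can be checked directly with SciPy's exponential distribution; the parameter values below mirror the bus example but are otherwise arbitrary.

```python
# Checking P(X > x + t | X > x) = P(X > t) for X ~ Exp(lambda); SciPy assumed, numbers illustrative.
from scipy.stats import expon

lam = 0.2                                # e.g. lambda = 1/5, as in the bus example
X = expon(scale=1.0 / lam)               # SciPy parametrizes Exp(lambda) through scale = 1/lambda
x, t = 10.0, 5.0

lhs = X.sf(x + t) / X.sf(x)              # P(X > x + t | X > x) = P(X > x + t) / P(X > x)
rhs = X.sf(t)                            # P(X > t) = e^{-lambda t}
print(lhs, rhs)                          # both ~0.3679 = e^{-1}
```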

2.3 Normal distribution

Perhaps the most widely used distribution is the normal (or Gaussian) distribution. It involves parameters µ ∈ R and σ > 0, and the density

    f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2}.

It can be checked that this is a legitimate PDF, i.e., that it integrates to one. Note also that this PDF is symmetric around x = µ. We use the notation N(µ, σ^2) to denote the normal distribution with parameters µ, σ. The distribution N(0, 1) is referred to as the standard normal distribution; a corresponding random variable is also said to be standard normal.

There is no closed form formula for the corresponding CDF, but numerical tables are available. These tables can also be used to find probabilities associated with general normal variables. This is because of the fact (to be verified later) that if X ~ N(µ, σ^2), then (X - µ)/σ ~ N(0, 1). Thus,

    P(X \le c) = P\Big(\frac{X - \mu}{\sigma} \le \frac{c - \mu}{\sigma}\Big) = \Phi\big((c - \mu)/\sigma\big),

where Φ is the CDF of the standard normal, available from the normal tables.
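
The standardization step above is easy to check numerically; the sketch below (not from the notes, SciPy assumed, with arbitrary parameter values) compares the direct CDF evaluation with Φ((c - µ)/σ).

```python
# P(X <= c) for X ~ N(mu, sigma^2), computed directly and via standardization (SciPy assumed).
from scipy.stats import norm

mu, sigma, c = 10.0, 2.0, 13.0
p_direct = norm(loc=mu, scale=sigma).cdf(c)     # P(X <= c)
p_standard = norm.cdf((c - mu) / sigma)         # Phi((c - mu) / sigma), the "normal table" route
print(p_direct, p_standard)                     # both ~0.9332
```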

2.4 Cauchy distribution

Here, f_X(x) = 1/(π(1 + x^2)), x ∈ R. It is an exercise in calculus to show that \int_{-\infty}^{\infty} f_X(t)\,dt = 1, so that f_X is indeed a PDF. The corresponding distribution is called a Cauchy distribution.

2.5 Power law

We have already defined discrete power law distributions. We present here a continuous analog. Our starting point is to introduce tail probabilities that decay according to a power law: P(X > x) = β/x^α, for x ≥ c > 0, for some parameters α, c > 0. In this case, the CDF is given by F_X(x) = 1 - β/x^α, for x ≥ c, and F_X(x) = 0, otherwise. In order for X to be a continuous random variable, F_X cannot have a jump at x = c, and we therefore need β = c^α. The corresponding density is of the form

    f_X(t) = \frac{dF_X}{dt}(t) = \frac{\alpha c^{\alpha}}{t^{\alpha + 1}}, \qquad t \ge c,

and f_X(t) = 0 for t < c.

3 EXPECTED VALUES

Similar to the discrete case, given a continuous random variable X with PDF f_X, we define

    E[X] = \int_{-\infty}^{\infty} x f_X(x)\,dx.

This integral is well defined and finite if \int_{-\infty}^{\infty} |x| f_X(x)\,dx < ∞, in which case we say that the random variable X is integrable. The integral is also well defined, but infinite, if one, but not both, of the integrals \int_{0}^{\infty} x f_X(x)\,dx and \int_{-\infty}^{0} x f_X(x)\,dx is infinite. If both of these integrals are infinite, the expected value is not defined.

Practically all of the results developed for discrete random variables carry over to the continuous case. Many of them, e.g., E[X + Y] = E[X] + E[Y], have the exact same form. We list below two results in which sums need to be replaced by integrals.

Proposition 1. Let X be a nonnegative random variable, i.e., P(X < 0) = 0. Then

    E[X] = \int_{0}^{\infty} (1 - F_X(t))\,dt.

Proof: We have

    \int_{0}^{\infty} (1 - F_X(t))\,dt = \int_{0}^{\infty} P(X > t)\,dt
                                       = \int_{0}^{\infty} \int_{t}^{\infty} f_X(x)\,dx\,dt
                                       = \int_{0}^{\infty} \int_{0}^{x} f_X(x)\,dt\,dx
                                       = \int_{0}^{\infty} x f_X(x)\,dx = E[X].

(The interchange of the order of integration turns out to be justified because the integrand is nonnegative.)
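
As a numerical aside (not in the notes), Proposition 1 can be sanity-checked for a nonnegative distribution with a known mean, e.g. X ~ Exp(λ) with E[X] = 1/λ; SciPy is assumed and the value of λ is arbitrary.

```python
# Checking E[X] = integral of (1 - F_X(t)) dt over [0, inf) for X ~ Exp(lambda) (SciPy assumed).
import numpy as np
from scipy.integrate import quad
from scipy.stats import expon

lam = 2.0
X = expon(scale=1.0 / lam)

mean_from_tail, _ = quad(X.sf, 0.0, np.inf)                   # integral of P(X > t) dt
mean_from_def, _ = quad(lambda x: x * X.pdf(x), 0.0, np.inf)  # integral of x f_X(x) dx
print(mean_from_tail, mean_from_def, 1.0 / lam)               # all ~0.5
```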

Proposition 2. Let X be a continuous random variable with density f_X, and suppose that g : R → R is a (Borel) measurable function such that g(X) is integrable. Then,

    E[g(X)] = \int_{-\infty}^{\infty} g(t) f_X(t)\,dt.

Proof: Let us express the function g as the difference of two nonnegative functions, g(x) = g^+(x) - g^-(x), where g^+(x) = max{g(x), 0}, and g^-(x) = max{-g(x), 0}. In particular, for any t ≥ 0, we have g(x) > t if and only if g^+(x) > t. We will use the result

    E[g(X)] = \int_{0}^{\infty} P\big(g(X) > t\big)\,dt - \int_{0}^{\infty} P\big(g(X) < -t\big)\,dt

from Proposition 1. The first term in the right-hand side is equal to

    \int_{0}^{\infty} \int_{\{x \,:\, g(x) > t\}} f_X(x)\,dx\,dt
      = \int_{-\infty}^{\infty} \int_{\{t \,:\, 0 \le t < g(x)\}} f_X(x)\,dt\,dx
      = \int_{-\infty}^{\infty} g^+(x) f_X(x)\,dx.

By a symmetrical argument, the second term in the right-hand side is given by

    \int_{0}^{\infty} P\big(g(X) < -t\big)\,dt = \int_{-\infty}^{\infty} g^-(x) f_X(x)\,dx.

Combining the above equalities, we obtain

    E[g(X)] = \int_{-\infty}^{\infty} g^+(x) f_X(x)\,dx - \int_{-\infty}^{\infty} g^-(x) f_X(x)\,dx = \int_{-\infty}^{\infty} g(x) f_X(x)\,dx.

Note that for this result to hold, the random variable g(X) need not be continuous. The proof is similar to the one for Proposition 1, and involves an interchange of the order of integration; see [GS] for a proof for the special case where g ≥ 0.
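
As an aside not in the notes, Proposition 2 is easy to test numerically: compare a Monte Carlo estimate of E[g(X)] with the integral of g(t) f_X(t). The sketch below uses X ~ N(0, 1) and g(x) = x^2, an arbitrary choice for which E[g(X)] = 1; SciPy and NumPy are assumed.

```python
# E[g(X)] via the integral of g(t) f_X(t) dt versus a Monte Carlo average (SciPy/NumPy assumed).
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

g = lambda x: x ** 2                     # illustrative choice; E[X^2] = 1 for X ~ N(0, 1)

integral, _ = quad(lambda t: g(t) * norm.pdf(t), -np.inf, np.inf)
samples = norm.rvs(size=200_000, random_state=0)
print(integral, np.mean(g(samples)))     # both ~1.0
```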

4 JOINT DISTRIBUTIONS

Given a pair of random variables X and Y, defined on the same probability space, we say that they are jointly continuous if there exists a measurable^2 function f_{X,Y} : R^2 → [0, ∞) such that their joint CDF satisfies

    F_{X,Y}(x, y) = P(X \le x, Y \le y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f_{X,Y}(u, v)\,dv\,du.

The function f_{X,Y} is called the joint PDF of X and Y. At those points where the joint PDF is continuous, we have

    \frac{\partial^2 F_{X,Y}}{\partial x\,\partial y}(x, y) = f_{X,Y}(x, y).

Similar to what was mentioned for the case of a single random variable, for any Borel subset B of R^2, we have

    P\big((X, Y) \in B\big) = \int_{B} f_{X,Y}(x, y)\,dx\,dy = \int_{R^2} I_B(x, y) f_{X,Y}(x, y)\,dx\,dy.

Furthermore, if B has Lebesgue measure zero, then P((X, Y) ∈ B) = 0.^3

We observe that

    P(X \le x) = \int_{-\infty}^{x} \int_{-\infty}^{\infty} f_{X,Y}(u, v)\,dv\,du.

Thus, X itself is a continuous random variable, with marginal PDF

    f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy.

We have just argued that if X and Y are jointly continuous, then X (and, similarly, Y) is a continuous random variable. The converse is not true. For a trivial counterexample, let X be a continuous random variable, and let Y = X. Then the set {(x, y) ∈ R^2 : x = y} has zero area (zero Lebesgue measure), but unit probability, which is impossible for jointly continuous random variables. In particular, the corresponding probability law on R^2 is neither discrete nor continuous, hence qualifies as singular.

Proposition 2 has a natural extension to the case of multiple random variables.

^2 Measurability here means that for every Borel subset B of R, the set f_{X,Y}^{-1}(B) is a Borel subset of R^2. The Borel σ-field in R^2 is the one generated by sets of the form [a, b] × [c, d].

^3 The Lebesgue measure on R^2 is the unique measure µ defined on the Borel subsets of R^2 that satisfies µ([a, b] × [c, d]) = (b - a)(d - c), i.e., agrees with the elementary notion of area on rectangular sets. Existence and uniqueness of such a measure is obtained from the Extension Theorem, in a manner similar to the one used in our construction of the Lebesgue measure on R.
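
To illustrate the marginalization formula f_X(x) = ∫ f_{X,Y}(x, y) dy (an aside not in the notes), the sketch below uses a hypothetical joint density f_{X,Y}(x, y) = x + y on the unit square, for which the marginal is f_X(x) = x + 1/2; SciPy is assumed.

```python
# Recovering a marginal PDF by integrating a (hypothetical) joint PDF over y (SciPy assumed).
from scipy.integrate import dblquad, quad

def f_XY(x, y):
    # Hypothetical joint density on the unit square: f_{X,Y}(x, y) = x + y, and 0 elsewhere.
    return x + y if (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0) else 0.0

total, _ = dblquad(lambda y, x: f_XY(x, y), 0.0, 1.0,
                   lambda x: 0.0, lambda x: 1.0)        # ~1.0: a legitimate joint PDF
x0 = 0.3
marginal, _ = quad(lambda y: f_XY(x0, y), 0.0, 1.0)     # f_X(x0) = integral of f_{X,Y}(x0, y) dy
print(total, marginal, x0 + 0.5)                        # marginal ~0.8 = x0 + 1/2
```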

Proposition 3. Let X and Y be jointly continuous with PDF f_{X,Y}, and suppose that g : R^2 → R is a (Borel) measurable function such that g(X, Y) is integrable. Then,

    E[g(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(u, v) f_{X,Y}(u, v)\,du\,dv.

5 INDEPENDENCE

Recall that two random variables, X and Y, are said to be independent if for any two Borel subsets, B_1 and B_2, of the real line, we have

    P(X \in B_1, Y \in B_2) = P(X \in B_1)\, P(Y \in B_2).

Similar to the discrete case (cf. Proposition 1 and Theorem 1 in Section 3 of Lecture 6), simpler criteria for independence are available.

Theorem 2. Let X and Y be jointly continuous random variables defined on the same probability space. The following are equivalent.

(a) The random variables X and Y are independent.
(b) For any x, y ∈ R, the events {X ≤ x} and {Y ≤ y} are independent.
(c) For any x, y ∈ R, we have F_{X,Y}(x, y) = F_X(x) F_Y(y).
(d) If f_X, f_Y, and f_{X,Y} are corresponding marginal and joint densities, then f_{X,Y}(x, y) = f_X(x) f_Y(y), for all (x, y) except possibly on a set that has Lebesgue measure zero.

The proof parallels the proofs in Lecture 6, except for the last condition, for which the argument is simple when the densities are continuous functions (simply differentiate the CDF), but requires more care otherwise.
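
As a closing numerical aside (not from the notes), criterion (c) of Theorem 2 can be checked empirically for two independently generated Exp(1) random variables; NumPy is assumed and the evaluation point (x, y) is arbitrary.

```python
# Empirical check of F_{X,Y}(x, y) = F_X(x) F_Y(y) for independent Exp(1) samples (NumPy assumed).
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
X = rng.exponential(1.0, size=n)
Y = rng.exponential(1.0, size=n)                # drawn independently of X

x, y = 0.8, 1.5
joint = np.mean((X <= x) & (Y <= y))            # estimate of F_{X,Y}(x, y)
product = np.mean(X <= x) * np.mean(Y <= y)     # estimate of F_X(x) * F_Y(y)
print(joint, product)                           # both ~(1 - e^{-0.8})(1 - e^{-1.5}) ~ 0.428
```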

MIT OpenCourseWare
http://ocw.mit.edu

6.436J / 15.085J Fundamentals of Probability
Fall 2008

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.