Chapter 4: Continuous Random Variables and Probability Distributions

Walid Sharabati, Purdue University
February 14, 2014
Professor Sharabati (Purdue University) Spring 2014 (Slide 1 of 37)

Chapter Overview
- Continuous random variables
- Probability density function (pdf)
    - Definition and interpretation
- Cumulative distribution function (cdf)
    - Definition and interpretation
    - Relationship between cdf and pdf
- Expectation, variance and percentile for continuous rv
- Some continuous distributions
    - Uniform and exponential
- The Normal distribution
    - Using the normal table
    - Approximating the Binomial distribution

Continuous rv and the Probability Density Function
- Continuous random variables
    - Definitions
    - Examples
- Probability density functions (pdf)
    - Definitions
    - Interpretations
    - Examples

Continuous rv
Definition: A random variable X is said to be continuous if its set of possible values includes an entire interval of numbers on the real line.
Example: Make depth measurements at a randomly selected location in a specific lake. Let X = the depth at this location. X can be any value between 0 and the maximum depth M.
Example: A chemical compound is randomly selected; let X = its pH value. X can be any value between 0 and 14.

Probability Density Function (PDF)
Definition: Let X be a continuous rv. Then a probability distribution or probability density function (pdf) of X is a function f(x) such that for any two numbers a and b with $a \le b$,
$$P(a \le X \le b) = \int_a^b f(x)\,dx,$$
i.e., the probability that X falls in [a, b] is the area under f(x) above this interval. The graph of f is the density curve.
f(x) must satisfy the following:
1. $f(x) \ge 0$ for all x.
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

Interpretations of f(x)
The density function f(x) describes how probability density is distributed, not probability itself.
1. For any c, P(X = c) = 0, i.e., the probability that X takes any specific value is 0.
2. We can only look at the probability that X falls in a specific interval. This is given by integrating f(x) over that interval.
3. For any two numbers a and b with a < b,
$$P(a \le X \le b) = P(a < X \le b) = P(a \le X < b) = P(a < X < b) = \int_a^b f(x)\,dx.$$

Pdf Example - Uniform
A bus comes every 30 minutes. Let X = the waiting time until a bus comes. The pdf of X is
$$f(x) = \frac{1}{30}, \quad 0 \le x \le 30.$$
- What is the probability that the waiting time is longer than 5 minutes?
- What is the probability that the waiting time is between 5 and 10 minutes?
In general: given b > a, an rv X with pdf
$$f(x) = \frac{1}{b-a}, \quad a \le x \le b$$
is said to have a uniform distribution.
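The two questions in the bus example can be checked numerically. The sketch below assumes the uniform pdf on [0, 30] from the slide; the helper name `uniform_interval_prob` is made up for illustration. For a uniform rv, the probability of an interval is the length of its overlap with [a, b] divided by b - a.

```python
def uniform_interval_prob(lo, hi, a=0.0, b=30.0):
    """P(lo <= X <= hi) for X uniform on [a, b]: overlap length / (b - a)."""
    left = max(lo, a)    # clip the query interval to the support [a, b]
    right = min(hi, b)
    return max(right - left, 0.0) / (b - a)

# Waiting time longer than 5 minutes: the interval (5, 30]
p_longer_than_5 = uniform_interval_prob(5, 30)
# Waiting time between 5 and 10 minutes
p_between_5_and_10 = uniform_interval_prob(5, 10)

print(p_longer_than_5, p_between_5_and_10)
```

The two values correspond to the areas 25/30 and 5/30 under the flat density.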

Pdf Example - Exponential
Let X = the life span of some bacteria (in hours). X is a continuous rv whose pdf is given as
$$f(x) = 2e^{-2x}, \quad x \ge 0.$$
- What is the probability that the bacteria lives over 2 hours?
- What is the probability that the bacteria dies within an hour?
In general: given λ > 0, an rv X with pdf
$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0$$
has an exponential distribution.
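A minimal sketch of the bacteria example, assuming rate λ = 2 as on the slide. It uses the fact that integrating $\lambda e^{-\lambda x}$ from x to infinity gives the tail probability $e^{-\lambda x}$; the function name `exp_tail` is an illustrative choice.

```python
import math

def exp_tail(x, lam):
    """P(X > x) for X exponential with rate lam, valid for x >= 0."""
    return math.exp(-lam * x)

lam = 2.0
p_over_2_hours = exp_tail(2.0, lam)           # e^{-4}
p_within_1_hour = 1.0 - exp_tail(1.0, lam)    # 1 - e^{-2}

print(round(p_over_2_hours, 4), round(p_within_1_hour, 4))
```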

Cumulative Distribution Function, Expectation, Variance and Percentile
- Cumulative distribution function (cdf) for continuous rv
    - Definition and interpretation
    - Relationship between pdf and cdf
    - Examples
- Expectation and variance of continuous rv
    - Definition
    - Examples
- Percentile
    - Definition and interpretation
    - Examples

Cumulative Distribution Function (CDF)
Definition: The cumulative distribution function F(x) for a continuous rv X is defined for every number x by
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy,$$
i.e., F(x) is the area under the density curve to the left of x. We have:
1. $0 \le F(x) \le 1$.
2. F(x) is non-decreasing.

F(x) and f(x)
From the definition of the cdf, we can easily derive:
- $P(X \le x) = F(x) = \int_{-\infty}^{x} f(y)\,dy$
- $f(x) = F'(x)$ at every x for which the derivative $F'(x)$ exists.
- For a < b, $P(a < X < b) = \int_a^b f(x)\,dx = F(b) - F(a)$.
- $P(X > a) = \int_a^{\infty} f(x)\,dx = 1 - F(a)$.

Finding F(x) and Using F(x) to Compute Probabilities
Uniform cdf: Find the cdf F(x) for the uniform distribution
$$f(x) = \begin{cases} \frac{1}{10}, & 2 \le x \le 12 \\ 0, & \text{otherwise} \end{cases}$$
- What is P(X < 6)?
- What is P(X > 3)?
Hint: In general, for the uniform pdf
$$f(x) = \begin{cases} \frac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
the cdf is given by
$$F(x) = \begin{cases} 0, & x < a \\ \frac{x-a}{b-a}, & a \le x < b \\ 1, & x \ge b \end{cases}$$
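The piecewise uniform cdf from the hint translates directly into code. This is a sketch under the slide's setup (a = 2, b = 12); `uniform_cdf` is an illustrative name, not a library function.

```python
def uniform_cdf(x, a, b):
    """F(x) for X uniform on [a, b], following the three-piece formula."""
    if x < a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

a, b = 2.0, 12.0
p_less_than_6 = uniform_cdf(6.0, a, b)             # F(6)
p_greater_than_3 = 1.0 - uniform_cdf(3.0, a, b)    # 1 - F(3)

print(p_less_than_6, p_greater_than_3)
```

Because P(X = c) = 0 for a continuous rv, P(X < 6) and P(X ≤ 6) are both F(6).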

Example Continued
Exponential cdf: Find the cdf F(x) for the exponential distribution
$$f(x) = \lambda e^{-\lambda x}, \quad \lambda > 0,\ x \ge 0.$$
- What is P(X > a)?
- What is P(a < X < b)?
Hint: The general form of an exponential cdf is
$$F(x) = \begin{cases} 0, & x < 0 \\ 1 - e^{-\lambda x}, & x \ge 0 \end{cases}$$
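One way to work through the exponential cdf and the two questions, assuming $0 \le a < b$ (that restriction is an assumption made here so that both points lie in the support):

```latex
\begin{aligned}
F(x) &= \int_0^x \lambda e^{-\lambda y}\,dy
      = \left[-e^{-\lambda y}\right]_0^x
      = 1 - e^{-\lambda x}, \qquad x \ge 0, \\
P(X > a) &= 1 - F(a) = e^{-\lambda a}, \\
P(a < X < b) &= F(b) - F(a) = e^{-\lambda a} - e^{-\lambda b}.
\end{aligned}
```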

Expectation of Continuous rv
Definition (Expectation): The expectation or mean value of a continuous rv X with pdf f(x) is defined as
$$E(X) = \mu_X = \int_{-\infty}^{\infty} x\,f(x)\,dx.$$
For a continuous rv, the expectation is an integral instead of a sum. It is a measure of the center of the distribution.

Properties of Expectation for Continuous rv
1. $E(aX + b) = aE(X) + b$
2. $E(a_1X_1 + a_2X_2 + \dots + a_nX_n) = a_1E(X_1) + a_2E(X_2) + \dots + a_nE(X_n)$
3. Expectation of a function of X: if h(X) is any function of X, the expectation of h(X) is
$$E[h(X)] = \mu_{h(X)} = \int_{-\infty}^{\infty} h(x)\,f(x)\,dx.$$

Examples of Expectations
Uniform expectation: Find the expectation of the uniform rv with pdf
$$f(x) = \begin{cases} \frac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
Answer: $E(X) = \frac{a+b}{2}$
Exponential expectation: Find the expectation of the exponential rv with parameter λ.
Answer: $E(X) = \frac{1}{\lambda}$

Examples Continued...
Find $E(X^2)$ for the uniform distribution with parameters a, b.
Answer: $E(X^2) = \frac{a^2 + ab + b^2}{3}$
Find $E(X^2)$ for the exponential distribution with parameter λ, i.e., $f(x) = \lambda e^{-\lambda x}$, λ > 0, x ≥ 0.
Answer: $E(X^2) = \frac{2}{\lambda^2}$
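The stated answers for E(X) and E(X²) can be sanity-checked by numerical integration. The sketch below uses a plain midpoint rule; the parameter values (a = 2, b = 12, λ = 2) and the truncation of the exponential integral at 40/λ are assumptions chosen for illustration, not part of the slides.

```python
import math

def midpoint_integral(g, lo, hi, n=100_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))

# Uniform on [a, b]: compare with (a + b)/2 and (a^2 + ab + b^2)/3
a, b = 2.0, 12.0
ex_uniform = midpoint_integral(lambda x: x / (b - a), a, b)
ex2_uniform = midpoint_integral(lambda x: x * x / (b - a), a, b)

# Exponential with rate lam: compare with 1/lam and 2/lam^2.
# The infinite upper limit is truncated; the omitted tail is negligible.
lam = 2.0
upper = 40.0 / lam
ex_exp = midpoint_integral(lambda x: x * lam * math.exp(-lam * x), 0.0, upper)
ex2_exp = midpoint_integral(lambda x: x * x * lam * math.exp(-lam * x), 0.0, upper)

print(ex_uniform, ex2_uniform, ex_exp, ex2_exp)
```

Agreement with the closed forms is a check on the formulas, not a derivation; the derivations come from integrating by hand.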

Variance of Continuous rv
Definition (Variance): The variance of a continuous rv X with pdf f(x) and expectation E(X) is
$$Var(X) = \int_{-\infty}^{\infty} (x - E(X))^2 f(x)\,dx = E[(X - E(X))^2].$$
The standard deviation of X is $\sqrt{Var(X)}$.
For a continuous rv, the variance is an integral instead of a sum. It is a measure of the spread of the distribution.
Properties of Variance:
1. $Var(aX + b) = a^2\,Var(X)$
2. $Var(X) = E(X^2) - (E(X))^2 = E(X^2) - \mu_X^2$

Examples of Variances
Variance of uniform: Find the variance of the uniform rv with pdf
$$f(x) = \begin{cases} \frac{1}{b-a}, & a \le x \le b \\ 0, & \text{otherwise} \end{cases}$$
Answer: $Var(X) = \frac{(b-a)^2}{12}$
Variance of exponential: Find the variance of the exponential rv with parameter λ.
Answer: $Var(X) = \frac{1}{\lambda^2}$
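Both answers follow from the shortcut $Var(X) = E(X^2) - (E(X))^2$ together with the expectations found earlier. A worked version of that arithmetic:

```latex
\begin{aligned}
\text{Uniform:}\quad Var(X)
  &= \frac{a^2 + ab + b^2}{3} - \left(\frac{a+b}{2}\right)^2
   = \frac{4a^2 + 4ab + 4b^2 - 3a^2 - 6ab - 3b^2}{12}
   = \frac{(b-a)^2}{12}, \\
\text{Exponential:}\quad Var(X)
  &= \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2
   = \frac{1}{\lambda^2}.
\end{aligned}
```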

Percentiles of a Continuous Distribution
Definition: Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a continuous rv X, denoted η(p), is defined by
$$p = F(\eta(p)) = \int_{-\infty}^{\eta(p)} f(y)\,dy.$$
η(p) is the value on the measurement axis such that 100p% of the area under the graph of f(x) lies to the left of η(p) and 100(1 - p)% lies to the right.
For example, η(0.8), the 80th percentile, means that 80% of the population lies below η(0.8) and 20% lies above.

Median of a Continuous rv: 50th Percentile
Definition: The median of a continuous distribution, denoted $\tilde{\mu}$, is the 50th percentile. That is,
$$0.5 = F(\tilde{\mu}) = \int_{-\infty}^{\tilde{\mu}} f(y)\,dy,$$
i.e., the median divides the area under the pdf into two equal halves.

Exercise
Find the 50th percentile of the pdf given below:
$$f(x) = \begin{cases} \frac{3}{2}(1 - x^2), & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$
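One possible approach to the exercise: integrate the pdf to get $F(x) = (3x - x^3)/2$ on [0, 1], then solve F(x) = 0.5. The cubic has no convenient rational root, so the sketch below solves it by bisection. The function names and the tolerance are illustrative choices.

```python
def cdf(x):
    """F(x) = (3x - x^3) / 2 on [0, 1], from integrating f(y) = 1.5 * (1 - y^2)."""
    return (3.0 * x - x ** 3) / 2.0

def percentile(p, lo=0.0, hi=1.0, tol=1e-12):
    """Solve F(x) = p by bisection; works because F is increasing on [0, 1]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) < p:
            lo = mid     # the percentile lies to the right of mid
        else:
            hi = mid     # the percentile lies at or to the left of mid
    return (lo + hi) / 2.0

median = percentile(0.5)
print(round(median, 4))
```

The same `percentile` helper returns any other percentile of this distribution, for example `percentile(0.8)` for η(0.8).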

Normal Distribution
- Normal pdf
- Standard Normal, pdf and cdf
- Normal table
- $z_\alpha$ notation
- Non-standard Normal
- Examples

Normal Distributions
Definition: A continuous rv X is said to have a normal distribution with parameters µ and σ, where $-\infty < \mu < \infty$ and σ > 0, if the pdf of X is
$$f(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty.$$
E(X) = µ, Var(X) = σ², and thus the standard deviation is σ.

Standard Normal Distribution
Definition: The normal distribution with parameter values µ = 0 and σ = 1 is called a standard normal distribution. The standard normal rv is denoted by Z. Its pdf is
$$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{z^2}{2}}, \quad -\infty < z < \infty.$$
The cdf, denoted by Φ(z) (instead of F(z)), is
$$\Phi(z) = P(Z \le z) = \int_{-\infty}^{z} f(y)\,dy.$$
- The standard normal density curve is called the z curve.
- The z curve is bell shaped and symmetric with respect to the y axis.
- Φ(z) gives the area under the standard normal density curve from $-\infty$ to the number z.

Standard Normal Table
There is no closed form for Φ(z), so standard normal cdf values have been tabulated using numerical methods.
Let Z be a standard normal rv. Find the following using the standard normal table:
1. $P(Z \le 0.85)$
   $P(Z \le 0.85) = \Phi(0.85)$
2. $P(Z > 1.32)$
   $P(Z > 1.32) = 1 - P(Z < 1.32) = 1 - \Phi(1.32)$
3. $P(-2.1 < Z < 1.78)$
   $P(-2.1 < Z < 1.78) = P(Z < 1.78) - P(Z < -2.1) = \Phi(1.78) - \Phi(-2.1)$
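As a cross-check on table lookups, Φ(z) can be evaluated with the error function from Python's standard library, using the identity $\Phi(z) = \tfrac12\,(1 + \mathrm{erf}(z/\sqrt{2}))$. The helper name `phi` is an illustrative choice.

```python
import math

def phi(z):
    """Standard normal cdf via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p1 = phi(0.85)                  # P(Z <= 0.85)
p2 = 1.0 - phi(1.32)            # P(Z > 1.32)
p3 = phi(1.78) - phi(-2.1)      # P(-2.1 < Z < 1.78)

print(round(p1, 4), round(p2, 4), round(p3, 4))
```

Results may differ from a printed table in the last digit because table entries are rounded to four decimals.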

Another Example
Let Z be a standard normal rv. Find z when:
1. $P(Z < z) = 0.9278$
   $P(Z < z) = \Phi(z) = 0.9278$; look for 0.9278 in the table and find z accordingly.
2. $P(|Z| < z) = 0.8132$
   $P(-z < Z < z) = P(-z < Z < 0) + P(0 < Z < z) = 2P(0 < Z < z) = 2(\Phi(z) - \Phi(0)) = 2(\Phi(z) - \tfrac{1}{2}) = 2\Phi(z) - 1 = 0.8132$, thus $\Phi(z) = 0.9066$; look for 0.9066 in the table and find z accordingly.
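Reading the table "backwards" amounts to inverting Φ. A sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later), whose `inv_cdf` method returns the z with a given left-tail area:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mu = 0, sigma = 1

# 1. P(Z < z) = 0.9278, so z is the inverse cdf at 0.9278
z1 = Z.inv_cdf(0.9278)

# 2. P(-z < Z < z) = 0.8132 means 2 * Phi(z) - 1 = 0.8132,
#    so Phi(z) = (1 + 0.8132) / 2 = 0.9066
phi_target = (1.0 + 0.8132) / 2.0
z2 = Z.inv_cdf(phi_target)

print(round(z1, 2), round(z2, 2))
```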

$z_\alpha$ Notation
Later, when we discuss inferential statistics, we will need values on the measurement axis that capture small tail areas under the normal curve. These are denoted $z_\alpha$:
- $z_\alpha$ denotes the value on the measurement axis for which α of the area under the z curve lies to the right of $z_\alpha$.
- $1 - \alpha$ is the area under the z curve to the left of $z_\alpha$, i.e., $z_\alpha$ is the $100(1-\alpha)$th percentile of the standard normal distribution.
- The z curve is symmetric with respect to the y axis, so the area to the left of $-z_\alpha$ is also α.
- The $z_\alpha$ values are usually referred to as z critical values.
Example: What is $z_{0.05}$? Which percentile is it?
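Since $z_\alpha$ is the $100(1-\alpha)$th percentile of the standard normal, it can be computed as the inverse cdf at $1 - \alpha$. A minimal sketch with the standard library; the function name `z_alpha` is illustrative.

```python
from statistics import NormalDist

def z_alpha(alpha):
    """Value with area alpha to its right under the z curve."""
    return NormalDist().inv_cdf(1.0 - alpha)

z_05 = z_alpha(0.05)   # the 95th percentile of the standard normal
print(round(z_05, 3))

# Symmetry check: the area to the left of -z_alpha is also alpha
left_area = NormalDist().cdf(-z_05)
```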

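Nonstandard Normal Distributions
Proposition: If $X \sim N(\mu, \sigma^2)$, then $Z = \frac{X - \mu}{\sigma}$ has a standard normal distribution. Thus
$$P(a \le X \le b) = P\left(\frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma}\right) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right)$$
$$P(X \le a) = \Phi\left(\frac{a-\mu}{\sigma}\right), \qquad P(X \ge b) = 1 - \Phi\left(\frac{b-\mu}{\sigma}\right)$$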

Empirical Rule
For any normal distribution (standard or nonstandard):
1. Roughly 68% of the values are within σ of the mean.
2. Roughly 95% of the values are within 2σ of the mean.
3. Roughly 99.7% of the values are within 3σ of the mean.
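The three percentages follow from standardizing: for any normal X, P(|X - µ| < kσ) = Φ(k) - Φ(-k), which does not depend on µ or σ. A short sketch that computes them with the standard library:

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal

def within_k_sigma(k):
    """P(|X - mu| < k * sigma) for any normal X, equal to Phi(k) - Phi(-k)."""
    return Z.cdf(k) - Z.cdf(-k)

coverage = {k: within_k_sigma(k) for k in (1, 2, 3)}
for k, p in coverage.items():
    print(f"within {k} sigma: {p:.4f}")
```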

Exercise of Nonstandard Normal
Reaction time for an in-traffic response to a brake signal from standard brake lights can be modelled with a normal distribution with mean 1.25 sec and standard deviation 0.46 sec. What is the probability that the reaction time is between 1.00 and 1.75 sec?
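A sketch of the reaction-time exercise computed two ways: directly from the N(1.25, 0.46²) cdf, and by standardizing to Z first. Both use `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 1.25, 0.46
reaction = NormalDist(mu=mu, sigma=sigma)

# Direct: F(1.75) - F(1.00) for the nonstandard normal
p_direct = reaction.cdf(1.75) - reaction.cdf(1.00)

# Standardized: Phi((b - mu) / sigma) - Phi((a - mu) / sigma)
Z = NormalDist()
z_lo = (1.00 - mu) / sigma
z_hi = (1.75 - mu) / sigma
p_standardized = Z.cdf(z_hi) - Z.cdf(z_lo)

print(round(p_direct, 4), round(p_standardized, 4))
```

A table-based answer rounds z_lo and z_hi to two decimals first, so it can differ slightly from these values.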

Percentiles of an Arbitrary Normal
The (100p)th percentile of a normal distribution with mean µ and standard deviation σ can be easily obtained from the corresponding percentile of the standard normal:
(100p)th percentile for N(µ, σ²) = µ + [(100p)th percentile for N(0, 1)] · σ
Example: What is the 95th percentile of N(µ = 2, σ = 4.5)?
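The example can be checked by applying the transformation to the standard normal percentile and comparing with a direct inverse-cdf call. This sketch reads the slide's "σ = 4.5" as the standard deviation.

```python
from statistics import NormalDist

mu, sigma = 2.0, 4.5

# Step 1: 95th percentile of N(0, 1)
z_95 = NormalDist().inv_cdf(0.95)

# Step 2: shift and scale, mu + z * sigma
x_95 = mu + z_95 * sigma

# Cross-check: ask for the percentile of N(mu, sigma^2) directly
x_95_direct = NormalDist(mu, sigma).inv_cdf(0.95)

print(round(x_95, 2))
```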

Normal Approximation to Binomial
Let X be a binomial rv based on n trials, each with probability of success p. If the binomial probability histogram is not too skewed, X has approximately a normal distribution with $\mu = np$ and $\sigma = \sqrt{np(1-p)}$:
$$P(X \le x) \approx \Phi\left(\frac{x + 0.5 - np}{\sqrt{np(1-p)}}\right)$$
In practice, the approximation is adequate provided that both $np \ge 10$ and $n(1-p) \ge 10$.
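To see how close the approximation is, the sketch below compares the exact binomial cdf with the continuity-corrected normal value. The numbers n = 50, p = 0.25, x = 10 are hypothetical, chosen only so that both np and n(1 - p) are at least 10; they do not come from the slides.

```python
import math
from statistics import NormalDist

def binom_cdf(x, n, p):
    """Exact P(X <= x) for X ~ Binomial(n, p), summing the pmf."""
    return sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

def normal_approx_cdf(x, n, p):
    """Phi((x + 0.5 - np) / sqrt(np(1 - p))), with the 0.5 continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return NormalDist().cdf((x + 0.5 - mu) / sigma)

n, p, x = 50, 0.25, 10   # hypothetical example values
exact = binom_cdf(x, n, p)
approx = normal_approx_cdf(x, n, p)

print(f"exact = {exact:.4f}, approx = {approx:.4f}")
```

The 0.5 added to x accounts for the fact that the binomial bar at x extends from x - 0.5 to x + 0.5 in the probability histogram.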