STAT 3610: Review of Probability Distributions Mark Carpenter Professor of Statistics Department of Mathematics and Statistics August 25, 2015
Support of a Random Variable

Definition: The support of a random variable X, denoted 𝒳, is defined to be the set of all points on the real line for which the pdf/pmf is non-zero. That is,

𝒳 = {x ∈ ℝ : f_X(x) > 0},

where the braces indicate a set.
Support of a Random Variable

The support of a random variable is usually denoted by the script form of the letter corresponding to that random variable: the random variable X has support 𝒳, the random variable Y has support 𝒴, and the random variable Z has support 𝒵. The support of a random variable is one of the first characteristics we can use to help identify its distribution.
Continuous versus Discrete Random Variables

Definition (Continuous Random Variable): A random variable is said to be continuous if its support is a continuous set (made up of unions and intersections of real intervals). The CDF* of a continuous random variable must be a continuous function on the real line.

Definition (Discrete Random Variable): A random variable is said to be discrete if its support is a discrete set. The CDF* of a discrete random variable is not continuous, but it is right-continuous.

*CDF stands for Cumulative Distribution Function
Difference between Discrete and Continuous Random Variables

So, whether the random variable X is a continuous or discrete random variable depends on whether its support is continuous or discrete. In the discrete case, the pmf* is a discontinuous function with a positive mass (probability) at each point in the support. In the continuous case, the pdf** itself does not have to be a continuous function everywhere, but it is usually a continuous function on intervals within the support.

*pmf stands for probability mass function
**pdf stands for probability density function
Exponential Random Variable and Exponential Distribution

Example 1 (continuous support): Suppose X ~ exponential(θ). From Section 3.2 (pp. 95-113) of the textbook, the exponential distribution, indexed by the scale parameter θ (θ > 0), has pdf

f(x; θ) = (1/θ) e^(-x/θ) I_[0,∞)(x) = (1/θ) e^(-x/θ) for x ≥ 0, and 0 otherwise,

which means {x ∈ ℝ : f(x) > 0} = [0, ∞) and the support of X is 𝒳 = [0, ∞), a continuous set. We see that the pdf of the exponential is zero for all points below zero, then jumps to (1/θ)e^0 = 1/θ at x = 0 and is continuous on [0, ∞).
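As a quick numerical sanity check (a sketch, not part of the slides), one can integrate this pdf over its support and confirm it totals 1; the value θ = 2 is an arbitrary choice for the check:

```python
import math

from scipy.integrate import quad

theta = 2.0  # arbitrary scale parameter chosen for this check

def exponential_pdf(x):
    """pdf of the exponential(theta): (1/theta) * exp(-x/theta) on [0, inf)."""
    return (1.0 / theta) * math.exp(-x / theta)

# Integrate over the support [0, inf); below 0 the pdf is identically zero,
# so it contributes nothing to the total probability.
total, _err = quad(exponential_pdf, 0, math.inf)
```

The integral comes back as 1 (up to numerical error), matching the claim that the pdf places all of its mass on [0, ∞).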
pdf and cdf for the Standard Exponential

Figure: pdf f(x) = e^(-x) and cdf F(a) = 1 - e^(-a) of the standard exponential (θ = 1).
Exponential is a Special Case of the Gamma and Weibull

You can verify that the non-truncated gamma and Weibull distributions, of which the exponential is a special case, share this same support. If X is a normal random variable, then the support is 𝒳 = (-∞, ∞) = ℝ.
Mean or Expected Value of a Random Variable

Recall, for any random variable X with pdf/pmf f(x), a measure of central tendency of the population is the population mean µ, the expected value (long-run average) of X. More formally:

Population Mean: For any random variable X with pdf/pmf f(x), the population mean is µ = E(X), where

µ = E(X) = ∫_{-∞}^{∞} x f(x) dx if X is continuous, and µ = E(X) = Σ_{x ∈ 𝒳} x f(x) if X is discrete.
Population Variance or Variance of a Random Variable

Population Variance: For any random variable X with pdf/pmf f(x), the population variance is σ² = E(X - µ)², where

σ² = E(X - µ)² = ∫_{-∞}^{∞} (x - µ)² f(x) dx if X is continuous, and σ² = E(X - µ)² = Σ_{x ∈ 𝒳} (x - µ)² f(x) if X is discrete.
Sometimes Easier Way to Compute the Population Variance

Note that it is often easier to compute the variance by noting that

σ² = E(X - µ)² = E(X² - 2µX + µ²) = E(X²) - 2µ E(X) + µ² = E(X²) - (E(X))².

So, rather than working through the original expression, one need only compute E(X²) and µ = E(X) and plug the results into the expression

σ² = E(X²) - µ².
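As a small illustration (a fair six-sided die, an example chosen here rather than taken from the slides), both routes give the same variance:

```python
# Fair six-sided die: a simple discrete random variable (illustrative choice).
support = [1, 2, 3, 4, 5, 6]
prob = 1.0 / 6.0

mu = sum(x * prob for x in support)        # E(X)   = 3.5
ex2 = sum(x**2 * prob for x in support)    # E(X^2) = 91/6

var_direct = sum((x - mu) ** 2 * prob for x in support)  # E(X - mu)^2
var_shortcut = ex2 - mu**2                               # E(X^2) - mu^2
```

Both expressions evaluate to 35/12 ≈ 2.917, confirming the shortcut.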
Expectations of Functions of a Random Variable

You might notice that E(X), E(X²), and E(X - µ)² are each the expected value of a different function of X: g₁(x) = x, g₂(x) = x², and g₃(x) = (x - µ)². In general, for any function g,

E[g(X)] = ∫_{-∞}^{∞} g(x) f(x) dx if X is continuous, and E[g(X)] = Σ_{x ∈ 𝒳} g(x) f(x) if X is discrete.
Moment Generating Functions

Whenever it exists, the moment-generating function of a random variable X, denoted M_X(t), is the continuous function of t given by

M_X(t) = E[e^(tX)], t ∈ (-h, h), h > 0.

The interval (-h, h) is referred to as the radius of convergence.
Properties of a Moment Generating Function (mgf)

This function is called the moment-generating function because you can find the n-th moment of the random variable X by computing its n-th derivative with respect to t and then setting t = 0, as follows:

E(Xⁿ) = M_X^(n)(0) = (dⁿ/dtⁿ) M_X(t) evaluated at t = 0.

Notice that the moment generating function is a continuous and differentiable function of t for |t| < h, whether or not X is continuous. In fact, the moment generating function is mathematically independent of the original variable (since it was integrated or summed over the support) and relates to X only through the moments of the distribution and any related parameters.
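A symbolic sketch of this recipe, using the exponential mgf M(t) = 1/(1 - θt) that appears later in these slides (the choice of example distribution here is mine):

```python
import sympy as sp

t, theta = sp.symbols('t theta', positive=True)

# mgf of the exponential(theta), valid for t < 1/theta
M = 1 / (1 - theta * t)

# n-th moment = n-th derivative of M with respect to t, evaluated at t = 0
m1 = sp.diff(M, t, 1).subs(t, 0)    # E(X)   -> theta
m2 = sp.diff(M, t, 2).subs(t, 0)    # E(X^2) -> 2*theta**2
variance = sp.simplify(m2 - m1**2)  # E(X^2) - (E X)^2 -> theta**2
```

Differentiating and setting t = 0 recovers E(X) = θ and σ² = θ², matching the exponential's known moments.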
Properties of the Exponential

We will show on the chalkboard that if X ~ Exp(θ), then

∫_{-∞}^{∞} f(x) dx = ∫_0^∞ (1/θ) e^(-x/θ) dx = 1,

µ = E(X) = ∫_{-∞}^{∞} x f(x) dx = ∫_0^∞ (x/θ) e^(-x/θ) dx = θ,

σ² = E(X - µ)² = ∫_{-∞}^{∞} (x - µ)² f(x) dx = θ².

The Cumulative Distribution Function (cdf) for any w ≥ 0 is F(w) = P(X ≤ w) = 1 - e^(-w/θ). The moment generating function (mgf), denoted M(t), exists and

M(t) = 1/(1 - θt), t < 1/θ.
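These chalkboard results can be spot-checked numerically with scipy, whose `expon` distribution uses the same scale parameterization as θ here (θ = 2 is an arbitrary choice for the check):

```python
from scipy import stats

theta = 2.0                    # arbitrary scale parameter for the check
X = stats.expon(scale=theta)   # scipy's scale argument plays the role of theta

mean = X.mean()                # should equal theta
var = X.var()                  # should equal theta**2
cdf_at_1 = X.cdf(1.0)          # should equal 1 - exp(-1/theta)
```

With θ = 2, the mean is 2, the variance is 4, and F(1) = 1 - e^(-1/2) ≈ 0.3935, as the formulas predict.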
Discrete Example (Binomial)

Example (discrete support): Suppose Y is a Binomial random variable with parameters n and p (see page 117 of the textbook). Then the probability mass function (pmf) is

f_Y(y) = C(n, y) p^y (1 - p)^(n-y) for y = 0, 1, ..., n, and 0 otherwise,

which means {y : f(y) > 0} = {0, 1, ..., n} and the support of Y is 𝒴 = {0, 1, ..., n}, a discrete (and finite) set of points. Recall that C(n, y) = n! / (y! (n - y)!).
cdf for a Binomial Random Variable

Example 2 (binomial): Suppose X is a Binomial(4, 0.5). Then the pmf is

f(x; n = 4, p = 1/2) = C(4, x) (1/2)^4, x ∈ 𝒳 = {0, 1, 2, 3, 4}.

Table: CDF for the Binomial(n = 4, p = 1/2)

Interval   F(x)
(-∞, 0)    P(X < 0) = 0                                   = 0
[0, 1)     P(X ≤ 0) = P(0)                                = 1/16
[1, 2)     P(X ≤ 1) = P(0) + P(1)                         = (1 + 4)/16 = 5/16
[2, 3)     P(X ≤ 2) = P(0) + P(1) + P(2)                  = (1 + 4 + 6)/16 = 11/16
[3, 4)     P(X ≤ 3) = P(0) + P(1) + P(2) + P(3)           = (1 + 4 + 6 + 4)/16 = 15/16
[4, ∞)     P(X ≤ 4) = P(0) + P(1) + P(2) + P(3) + P(4)    = (1 + 4 + 6 + 4 + 1)/16 = 1
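The table's entries can be reproduced with scipy's binomial distribution (a quick check, not part of the slides):

```python
from scipy import stats

n, p = 4, 0.5
Y = stats.binom(n, p)

# pmf at each support point: expect 1/16, 4/16, 6/16, 4/16, 1/16
pmf = [Y.pmf(k) for k in range(n + 1)]

# cdf at each support point: expect 1/16, 5/16, 11/16, 15/16, 1
cdf = [Y.cdf(k) for k in range(n + 1)]
```

Each cdf value is the running sum of the pmf values up to that point, matching the table row by row.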
cdf for a Binomial Random Variable

Figure: step-function cdf of the Binomial(4, 1/2), jumping at x = 0, 1, 2, 3, 4 to heights 1/16, 5/16, 11/16, 15/16, and 1.
Binomial, Hypergeometric, Geometric

One can verify that the supports of the negative binomial(r, p), the geometric(p), and the Poisson random variables (see page 126) are the same countably infinite set {0, 1, 2, ...}. The support of the hypergeometric(n, M, N) is the discrete/finite set {max(0, n - N + M), ..., min(n, M)}.
Poisson Random Variable

Suppose X is a discrete random variable with a Poisson(λ) distribution. Then the probability mass function (pmf) is

f(x) = λ^x e^(-λ) / x! for x = 0, 1, 2, ..., and 0 otherwise,

where λ > 0. Recall that the Maclaurin series of the exponential function e^y is

e^y = Σ_{i=0}^{∞} y^i / i!,

where i! denotes the factorial of the integer i.
Some Properties of a Poisson Random Variable

Σ_{x ∈ 𝒳} f(x; λ) = Σ_{x=0}^{∞} λ^x e^(-λ) / x! = 1,

µ = E(X) = Σ_{x ∈ 𝒳} x f(x) = λ,

σ² = E(X - µ)² = Σ_{x ∈ 𝒳} (x - µ)² f(x) = λ,

M(t) = E(e^(tX)) = Σ_{x ∈ 𝒳} e^(tx) f(x) = e^(λ(e^t - 1)), -∞ < t < ∞.
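These properties can also be checked numerically (a sketch; λ = 3 and t = 0.1 are arbitrary values chosen for the check). The mgf sum is truncated at a point where the remaining pmf mass is negligible:

```python
import math

from scipy import stats

lam = 3.0                        # arbitrary rate parameter for the check
X = stats.poisson(lam)

mean, var = X.mean(), X.var()    # both should equal lambda

# Compare the closed-form mgf e^(lam*(e^t - 1)) against a truncated sum
# of e^(t*x) * f(x) over the support {0, 1, 2, ...}.
t = 0.1
mgf_formula = math.exp(lam * (math.exp(t) - 1.0))
mgf_sum = sum(math.exp(t * k) * X.pmf(k) for k in range(200))
```

The truncated sum agrees with the closed form to high precision, and the mean and variance both come back as λ.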