Chapter 5  Means and Variances

Our discussion of probability has taken us from the simple classical view of counting successes relative to total outcomes to the idea of a probability density function (pdf). We now know that to use probability consistently in our applications we must have a pdf for any random variable we are considering. The pdf is the function that allows us to make all probability statements about the random variable under consideration. Of course, this pdf must satisfy two conditions: it must be non-negative, and it must sum (or integrate) to 1 over all values in the support. Although the pdf is all we need to discuss probability, many times we would like to have a few indicators that tell us how a pdf is constructed, that is, how its probability is distributed over the support. The two most famous indicators of the shape of a pdf are its mean and variance.¹

The mean is usually called a measure of central tendency. It is a weighted average of the support, using the associated density (pdf) as the system of weights. Remember that the support is the set of values s where the pdf, f(s), is positive. The mean is not computed from data; it is computed from the pdf. Sometimes the mean is further clarified by calling it the "population mean." If you have a pdf, then you can compute that pdf's mean, and that will be the mean of the random variable associated with the pdf. The population mean is always computed according to a formula involving the pdf. Below are the formulas for the discrete and continuous cases, respectively. These formulas show that the mean of a random variable is computed by taking each s in the support, multiplying by the pdf, f(s), and then adding things up (by summation or by integration).

¹ It is possible to have pdfs that do not have a variance or even a mean. These are unusual cases that one would deal with in higher-level courses. All of the densities we will consider in this course have means and variances.
μ = Σ sᵢ f(sᵢ), summed over all sᵢ in the support   (discrete)

μ = ∫ s f(s) ds, integrated over the support   (continuous)

The variance of a random variable is commonly referred to as a measure of dispersion. It measures the extent to which probability is dispersed over the support for a fixed mean. Probability can be thought of as being spread over the support. If the pdf were concentrated on only one point, then there would be no randomness. In a sense, greater dispersion about the same mean means greater randomness. That is, holding the mean constant, a greater variance implies greater randomness. The formulas used to calculate variances for discrete and continuous pdfs are as follows:

σ² = Σ (sᵢ − μ)² f(sᵢ), summed over all sᵢ in the support   (discrete)

σ² = ∫ (s − μ)² f(s) ds, integrated over the support   (continuous)

These formulas look complicated, but they are no more difficult to compute than the probabilities in Chapters 3 and 4. Again, we can use our phone software, or we can do the summations and integrations on the computer or online. From the definitions of the mean and variance above we see that the mean is the expected value of the random variable. This can be written as

Mean of X = μ = E[X]

Variance of X = σ² = E[(X − μ)²]

The expected value E[·] has some important properties. For example, suppose X₁, X₂, X₃, …, X_N is a collection of independent random variables, each with mean μ and variance σ² (called a random sample). Then
E[X₁ + X₂ + ⋯ + X_N] = E[X₁] + E[X₂] + ⋯ + E[X_N] = Nμ

and

var[X₁ + X₂ + ⋯ + X_N] = var[X₁] + var[X₂] + ⋯ + var[X_N] = Nσ²

Another important relation is the following:

var(X) = E[(X − μ)²] = E[X²] − {E[X]}²

Discrete Examples: Our discrete densities are the Bernoulli, Binomial, Geometric, and Poisson. Let's calculate the mean and variance for each.

(1) Bernoulli density

The Bernoulli is very easy. We can do it without using much formal mathematics. Let the density be described as f(0) = (1 − p) and f(1) = p. Using the formula for the mean, the mean is

μ = 0·f(0) + 1·f(1) = p

while the variance is

σ² = (0 − μ)²f(0) + (1 − μ)²f(1) = p²(1 − p) + (1 − p)²p = p(1 − p)

Hence, the mean of a Bernoulli random variable is μ = p and the variance is σ² = p(1 − p). This means that if we know p, we know both the mean and the variance of the Bernoulli. For example, if p = 1/2, then the mean is 1/2 and the variance is 1/4.
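As a quick check, here is a minimal Python sketch (not from the text) that computes the mean and variance of the Bernoulli pdf directly from the definitions above, and then verifies the shortcut var(X) = E[X²] − {E[X]}². The dictionary representation of the pdf is our own convention.

```python
def mean(pdf):
    """Mean = sum of s * f(s) over the support."""
    return sum(s * f for s, f in pdf.items())

def variance(pdf):
    """Variance = sum of (s - mu)^2 * f(s) over the support."""
    mu = mean(pdf)
    return sum((s - mu) ** 2 * f for s, f in pdf.items())

p = 0.5
bernoulli = {0: 1 - p, 1: p}   # f(0) = 1 - p, f(1) = p

mu = mean(bernoulli)           # should equal p = 0.5
sigma2 = variance(bernoulli)   # should equal p(1 - p) = 0.25

# Shortcut: var(X) = E[X^2] - (E[X])^2
ex2 = sum(s ** 2 * f for s, f in bernoulli.items())
shortcut = ex2 - mu ** 2       # should also equal 0.25
print(mu, sigma2, shortcut)
```

The same two functions work for any discrete pdf with finite support, which is the point: once the pdf is written down, the mean and variance follow mechanically.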
(2) Binomial density

The Binomial random variable can be constructed as the sum of N independent Bernoulli random variables. Therefore, we can use what we know about the Bernoulli mean and variance, together with our laws of expectations, to derive the mean and variance of the Binomial random variable:

Mean of Bn(N, p) = μ = Np

Variance of Bn(N, p) = σ² = Np(1 − p)

(3) Geometric density

Let's suppose the geometric density is defined as f(k) = (1 − p)ᵏ p for k = 0, 1, 2, … and for 0 < p < 1. This is an easy density to graph. The mean of a geometrically distributed random variable can be written as

μ = p{0(1 − p)⁰ + 1(1 − p)¹ + 2(1 − p)² + ⋯} = (1 − p)/p

while the variance can be written

σ² = (1 − p)/p².

Notice that once we know the value of p, we know both the mean and the variance. For example, if p = 1/2, then the mean of the geometric random variable is 1 while the variance is 2. The derivation of these results requires deeper mathematical skills than an introductory course without calculus covers.
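Since the derivation is beyond this course, we can at least verify the geometric formulas numerically. The sketch below (our own, not from the text) truncates the infinite sum at a large K; the cutoff K = 1000 is an assumption chosen so the neglected tail is negligibly small.

```python
# Approximate the mean and variance of the geometric pdf
# f(k) = (1-p)^k * p for k = 0, 1, 2, ... by truncating at K.
p = 0.5
K = 1000   # truncation point (assumption; tail beyond K is negligible)
pdf = {k: (1 - p) ** k * p for k in range(K + 1)}

mu = sum(k * f for k, f in pdf.items())                  # (1-p)/p = 1
sigma2 = sum((k - mu) ** 2 * f for k, f in pdf.items())  # (1-p)/p^2 = 2
print(mu, sigma2)
```

For p = 1/2 the printout agrees with the formulas: mean 1 and variance 2.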
(4) Poisson density

The Poisson density is defined as

f(k) = λᵏ e^(−λ) / k!  for k = 0, 1, 2, … and for λ > 0.

It is very interesting that the mean and variance of the Poisson random variable are equal to each other. The mean can be written as

μ = λ

while the variance can be written³

σ² = λ.

On the next page we show the graphs of the probability densities (pdfs) for the four major discrete random variables using selected parameter values. The binomial looks as if it is continuous and not discrete, but this is just the way it is plotted; it should be plotted using spikes instead of smooth curves. All four are examples of pdfs and can be used to compute probabilities. Each of these densities can be used to compute the mean and variance for that density. We now turn to the continuous densities to look at the means and variances.

³ An interesting ratio can be made by taking the standard deviation and dividing by the mean. This is called the coefficient of variation, and it can be estimated if we have data. The theoretical or population coefficient of variation for the Poisson random variable is equal to

CV = √var(X) / mean(X) = √λ / λ = 1/√λ,

which says that as λ increases, the theoretical coefficient of variation falls, and thus the random variable becomes "less random" as λ increases. This is an example of a random variable whose risk falls as its mean rises, so long as we define risk as the coefficient of variation. Note that the CV is only defined for random variables with positive means.
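The equality of the Poisson mean and variance, and the footnote's formula CV = 1/√λ, can be checked numerically with a short sketch of our own. The truncation point K and the choice λ = 4 are assumptions for illustration; the probabilities are built up recursively to avoid computing large factorials directly.

```python
import math

lam = 4.0
K = 100            # truncation point (assumption; tail beyond K is negligible)

# Build f(k) = e^{-lam} * lam^k / k! recursively: f(k+1) = f(k) * lam/(k+1)
pdf = {}
f = math.exp(-lam)                # f(0)
for k in range(K + 1):
    pdf[k] = f
    f *= lam / (k + 1)

mu = sum(k * f for k, f in pdf.items())                  # should be lam = 4
sigma2 = sum((k - mu) ** 2 * f for k, f in pdf.items())  # should also be 4
cv = math.sqrt(sigma2) / mu                              # 1/sqrt(lam) = 0.5
print(mu, sigma2, cv)
```

Both moments come out equal to λ, and the coefficient of variation comes out to 1/√λ = 0.5, as the footnote claims.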
Continuous Examples: Our continuous densities are the Normal, Negative Exponential, Student's t, and Chi-Squared densities. There are many other densities, but these four are the ones that are discussed and used most often.

(1) Normal density

This is the famous bell-shaped curve. The formula was written earlier and is repeated here for convenience. This density is symmetrical and is centered at μ, where s = μ.
f(s) = (1/√(2πσ²)) e^(−(s − μ)²/(2σ²))  for −∞ < s < ∞

Thus, the mean of a Normal is just the parameter μ. The second parameter in the density is σ², and this corresponds to the variance of the Normal density. Thus, for the Normal density, denoted N(μ, σ²), the parameters are just the mean and variance. Here is an example. Consider this specific N(2, 3) density:

f(s) = (1/√(6π)) e^(−(s − 2)²/6)  for −∞ < s < ∞.

Here is another one:

f(s) = (1/√(2π)) e^(−s²/2)  for −∞ < s < ∞.

This last density is an N(0, 1). It is called the Standard Normal because the mean is 0 and the variance is 1. The standard normal is the most important and practical density among all the densities; it is used hundreds of thousands of times a day throughout the world. A table on the next page lists the areas to the left for various values of z.

Why is it that this density stands out among all the others? The reason is that many things in life are either-or. That is, at a fundamental level, either something is or something isn't. We have seen that this is the basis for the Bernoulli random variable, which takes the values 0 or 1 with probabilities p and (1 − p), respectively. Now, suppose you have N independent Bernoulli random variables. We can call them X₁, X₂, …, X_N. Add these together and you get Y_N = X₁ + X₂ + ⋯ + X_N. But Y_N is exactly a Binomial random variable Bn(N, p) with mean Np and variance Np(1 − p). Form the random variable Z_N from this Binomial as follows:

Z_N = (Y_N − Np) / √(Np(1 − p)).

As N gets bigger and bigger (which is the way life is), this random variable will come closer and closer to a standard normal random variable. So, if we add together a lot of little Bernoulli effects,
the average of these effects will be approximately normal. In fact, we need not begin with Bernoulli random variables. An important theorem later on says we can take any set of N independent random variables (under certain conditions) and we will get the same result. This is an amazing and very useful result called the Central Limit Theorem, which we will study later on. It can help us to do approximate analyses that are simple and easy to understand.

(P1) Suppose that the random variable X is distributed as a Negative Exponential density f(x) = 5e^(−5x) for x ≥ 0. Use your phone software to show this density integrates to 1. Then find the mean and the variance.
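If phone software is not at hand, problems like (P1) can also be checked with a short numerical-integration sketch of our own. The midpoint rule, the step count, and the upper cutoff at x = 10 are assumptions; the true integrals run to infinity, but 5e^(−5x) is negligible beyond that point.

```python
import math

def integrate(g, a, b, n=10000):
    """Composite midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (i + 0.5) * h) for i in range(n)) * h

f = lambda x: 5 * math.exp(-5 * x)   # Negative Exponential density from (P1)

total = integrate(f, 0, 10)                                 # should be about 1
mu = integrate(lambda x: x * f(x), 0, 10)                   # mean, about 1/5
sigma2 = integrate(lambda x: (x - mu) ** 2 * f(x), 0, 10)   # variance, about 1/25
print(total, mu, sigma2)
```

The printout confirms the density integrates to 1, with mean 0.2 and variance 0.04, matching the known results 1/5 and 1/25 for this density.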
(P2) Suppose that the random variable X is distributed as a Negative Exponential density f(x) = 6e^(−6x) for x ≥ 0. Use your phone software to show this density integrates to 1. Then find the mean and the variance.

(P3) Suppose that the random variable X is distributed as a Negative Exponential density f(x) = 5e^(−5x) for x ≥ 0. Use your phone software to calculate P[X ≤ 0.25]. Now find P[X ≤ 0.40].

(P4) Suppose that the random variable X is distributed as a Bernoulli density with f(0) = p = 0.25 and f(1) = (1 − p) = 0.75. Show that f(0) + f(1) = 1 and that f(0) and f(1) > 0. Therefore, f(s) is truly a probability density. Now find the mean and variance of X.

(P5) Suppose that you flip a fair coin 3 times. If heads, then s = 1; if tails, then s = 0. Now add the 3 outcomes to get the random variable X. Write down the probability density for the random variable X. Use a tree diagram to help you.

(P6) Suppose that you have the Uniform density f(s) = 1/4 for 0 ≤ s ≤ 4. Find the probability P[X ≤ 3]. Find the mean and variance of X.

(P7) Use your phone software to find

∫ from −∞ to 0 of (1/√(2π·4)) e^(−(x − 2)²/(2·4)) dx.

Find the mean and the variance. What is the standard deviation for X? Finally, find

∫ from 0 to 4 of (1/√(2π·4)) e^(−(x − 2)²/(2·4)) dx  and  ∫ from 0 to 6 of (1/√(2π·4)) e^(−(x − 2)²/(2·4)) dx.

You can use my phone in class if you do not have one.
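For a problem like (P7), assuming the integrand is the N(2, 4) density (mean 2, standard deviation 2; our reading of the integrand, so treat it as an assumption), the probabilities can be checked against the closed-form normal CDF, which Python exposes through the error function:

```python
import math

def normal_cdf(x, mu, sigma):
    """P[X <= x] for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 2.0, 2.0   # N(2, 4): mean 2, variance 4, standard deviation 2

p_left_of_0 = normal_cdf(0, mu, sigma)                       # area left of 0
p_0_to_4 = normal_cdf(4, mu, sigma) - normal_cdf(0, mu, sigma)
p_0_to_6 = normal_cdf(6, mu, sigma) - normal_cdf(0, mu, sigma)
print(p_left_of_0, p_0_to_4, p_0_to_6)
```

Because 0 and 4 sit exactly one standard deviation on either side of the mean, the middle answer reproduces the familiar "about 68% within one standard deviation" rule.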