Continuous Probability Distributions

Chapter 5 Continuous Probability Distributions

5.1 Probability density function

Example 5.1.1. Revisit Example 3.1.1. The sample space is

$$
S = \begin{Bmatrix}
11 & 12 & 13 & 14 & 15 & 16 \\
21 & 22 & 23 & 24 & 25 & 26 \\
31 & 32 & 33 & 34 & 35 & 36 \\
41 & 42 & 43 & 44 & 45 & 46 \\
51 & 52 & 53 & 54 & 55 & 56 \\
61 & 62 & 63 & 64 & 65 & 66
\end{Bmatrix} \tag{5.1.1}
$$

with $\Pr(11) = \Pr(12) = \cdots = \Pr(66) = 1/36$.
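The sample space above is small enough to enumerate directly. The sketch below does so and tabulates the sum of the two numbers, the random variable considered next. It assumes, as the labels 11 through 66 suggest, that each outcome is an ordered pair of numbers from 1 to 6; the names `sample_space` and `sum_pmf` are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 equally likely ordered pairs (first number, second number).
sample_space = list(product(range(1, 7), repeat=2))
prob_each = Fraction(1, len(sample_space))  # 1/36 for every sample point

# Count how many pairs give each sum, then turn counts into probabilities.
counts = Counter(a + b for a, b in sample_space)
sum_pmf = {s: counts[s] * prob_each for s in sorted(counts)}

for s in sum_pmf:
    print(f"Pr(X = {s:2d}) = {counts[s]}/36")
```

Exact fractions are used so that the probabilities add up to exactly 1.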

Let X = sum of the two numbers. Then X can take the values 2, 3, ..., 12 with probabilities 1/36, 2/36, 3/36, 4/36, 5/36, 6/36, 5/36, 4/36, 3/36, 2/36, and 1/36, respectively.

Now suppose I asked our own Peter to pick a point on a 6 × 8 rectangle.

What is the probability that he picks the point (2,1)? $= 1/\infty = 0$.
What is the probability that he picks the point (0,0)? $= 1/\infty = 0$.

Let me ask this now: What is the probability that Peter picks a point in the first half of the rectangle?

What is the probability that he picks a point in the middle third of the rectangle?

Pr(a point is picked in the first half of the rectangle) = 1/2.
Pr(a point is picked in the middle third of the rectangle) = 1/3.

Therefore, even though the probability of each sample point is zero, we can define probabilities of specific events.

What assumption did we make in calculating the above probabilities? What if that assumption is violated?
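The answers 1/2 and 1/3 can be reproduced with a short sketch if we assume every point of the rectangle is equally likely to be picked (a uniform choice). Under that assumption the probability of a region is its area divided by the area of the rectangle. The code also assumes that "first half" and "middle third" are vertical strips along the side of length 6; both function names are hypothetical.

```python
import random
from fractions import Fraction

WIDTH, HEIGHT = 6, 8  # the 6 x 8 rectangle from the example

def strip_probability(x_lo, x_hi, width=WIDTH):
    """Pr(uniform point lands in the strip x_lo <= x <= x_hi).

    The strip spans the full height, so the area ratio is a length ratio.
    """
    return Fraction(x_hi - x_lo, width)

def simulate_strip(x_lo, x_hi, n=100_000, seed=0):
    """Monte Carlo estimate of the same probability, assuming uniform picks."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.uniform(0, WIDTH)
        rng.uniform(0, HEIGHT)  # y-coordinate; drawn but irrelevant for a full-height strip
        hits += x_lo <= x <= x_hi
    return hits / n

print(strip_probability(0, 3), simulate_strip(0, 3))  # first half
print(strip_probability(2, 4), simulate_strip(2, 4))  # middle third
```

If the uniform assumption is violated, for instance if Peter favors points near the center, the area ratio no longer gives the probability.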

Example 5.1.2. Menstrual cycle example (revisited)

[Figure 5.1: Histogram of menstrual cycle. Horizontal axis: Time (days), 24 to 38. Vertical axis: Relative frequency. Distribution of time intervals between successive menstrual periods (days) of college women (Table 2.3; Rosner; Page 13). Mean = 28.5; Median = 28; Mode = 28.]

If we randomly pick a college woman, what is the probability that her menstrual cycle will be longer than 28 days?

If we randomly pick a college woman, what is the probability that her menstrual cycle will be between 28 and 32 days?

Can we approximate these probabilities by using the smooth curve? Can we find a function (formula) that represents the smooth curve?

Probability density function

The probability density function (pdf) of a random variable X is a function (curve) such that the area under it between any two values a and b is equal to the probability that X falls between a and b.

A probability density function satisfies: (i) it is non-negative, and (ii) the total area under the curve is 1.
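The two defining properties of a pdf, and the reading of probability as area, can be checked numerically. The sketch below uses a made-up density, f(x) = 2x on [0, 1], chosen only because its areas are easy to verify by hand. It is not the density of the menstrual cycle data.

```python
def density(x):
    """Hypothetical pdf for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def area(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

total_area = area(density, 0.0, 1.0)   # property (ii): should be 1
prob = area(density, 0.25, 0.5)        # Pr(0.25 < X < 0.5) = 0.5**2 - 0.25**2

print(f"total area = {total_area:.6f}")
print(f"Pr(0.25 < X < 0.5) = {prob:.6f}")
```

For this density the exact answer is 0.5² − 0.25² = 0.1875, which the numerical area reproduces.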

Cumulative distribution function

The cumulative distribution function (cdf) of a random variable X evaluated at x is given by $F(x) = \Pr(X \le x)$.

The cumulative distribution function F(x) satisfies: (i) $F(x)$ is non-decreasing in x, (ii) $F(-\infty) = 0$, (iii) $F(\infty) = 1$, and (iv) $\Pr(a < X \le b) = F(b) - F(a)$.

Mean and variance of a continuous random variable

$\mu = E(X)$, and $\sigma^2 = \operatorname{Var}(X) = E(X^2) - \mu^2$.
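For a continuous random variable the cdf, mean, and variance above are usually written as integrals of the pdf. The forms below are the standard definitions; the symbol $f$ for the pdf and the dummy variable $t$ are notation assumed here.

```latex
F(x) = \int_{-\infty}^{x} f(t)\,dt, \qquad
\mu = E(X) = \int_{-\infty}^{\infty} x\,f(x)\,dx, \qquad
\sigma^2 = \operatorname{Var}(X) = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx
         = E(X^2) - \mu^2 .
```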

5.2 Normal Distribution

The normal distribution is the most widely used continuous distribution.

It is symmetric and bell-shaped.
About 68% of the area under the curve lies within one standard deviation of the mean.
About 95% of the area under the curve lies within two standard deviations of the mean.
About 99.7% of the area under the curve lies within three standard deviations of the mean.
Mean = Median = Mode.
It has the density function

$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right], \qquad -\infty < x < \infty. \tag{5.2.1}
$$

Only two parameters, $\mu$ and $\sigma^2$, characterize the whole distribution.
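The 68%, 95%, and 99.7% figures can be checked for any choice of parameters. The sketch below writes out density (5.2.1) by hand, compares it with `statistics.NormalDist` from the Python standard library, and computes the area within k standard deviations of the mean. The values µ = 10 and σ = 2 are arbitrary and chosen only for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density (5.2.1), written out term by term."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 10.0, 2.0          # arbitrary illustrative parameters
dist = NormalDist(mu, sigma)

# The hand-written formula should agree with the library density.
formula_matches = math.isclose(normal_pdf(11.3, mu, sigma), dist.pdf(11.3))

def area_within(k):
    """Area under the curve between mu - k*sigma and mu + k*sigma."""
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

print(formula_matches)
for k in (1, 2, 3):
    print(f"within {k} sd: {area_within(k):.4f}")
```

Changing `mu` and `sigma` leaves the three areas unchanged, which is why the rule can be stated without reference to the parameters.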

A normal distribution with parameters $\mu$ and $\sigma^2$ is denoted by $N(\mu, \sigma^2)$.

If $X \sim N(\mu, \sigma^2)$, then $\Pr(X \le \mu) = \Pr(X \ge \mu) = \tfrac{1}{2}$.

The center (mean) of the distribution is $\mu$ and the variance of the distribution is $\sigma^2$, i.e., $E(X) = \mu$, $\operatorname{Var}(X) = \sigma^2$.

Standard normal distribution

The standard normal distribution is a special case of the normal distribution with mean $\mu = 0$ and variance 1. The convention is to denote a standard normal random variable by Z.

If the random variable Z follows a standard normal distribution [$N(0, 1)$], then

$\Pr(-1 \le Z \le 1) = 0.68$,

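$\Pr(-2 \le Z \le 2) = 0.95$,
$\Pr(-3 \le Z \le 3) = 0.997$.

pdf of Z:

$$
\varphi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right), \qquad -\infty < z < \infty. \tag{5.2.2}
$$

cdf of Z:

$$
\Phi(z) = \Pr(Z \le z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt, \qquad -\infty < z < \infty. \tag{5.2.3}
$$

Note that $\Phi(-z) = 1 - \Phi(z)$.

Use Table 3 in the Appendix of FOB to calculate the cdf for specific values of z. For instance,

$\Pr(Z \le 0.84) = \Phi(0.84) = 0.7995$,
$\Pr(Z \le -0.84) = \Phi(-0.84) = 1 - \Phi(0.84) = 1 - 0.7995 = 0.2005$,
$\Pr(0.12 < Z \le 0.24) = \Phi(0.24) - \Phi(0.12) = 0.5948 - 0.5478 = 0.0470$,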

$\Pr(-1.96 < Z \le -0.84) = \Phi(-0.84) - \Phi(-1.96) = 0.2005 - 0.0250 = 0.1755$,
$\Pr(-1.96 < Z \le 1.96) = \Phi(1.96) - \Phi(-1.96) = 0.9750 - 0.0250 = 0.9500$.

$\Phi(z)$ can also be calculated using the MS-Excel function Normdist(z,µ,σ,True) with µ = 0 and σ = 1. For example, $\Pr(Z \le 0.84)$ = Normdist(.84, 0, 1, True) = 0.799546.

Percentiles of a standard normal distribution

The 100pth percentile $z_p$ of a $N(0, 1)$ random variable is given by $\Pr(Z \le z_p) = \Phi(z_p) = p$.

We can calculate the percentiles of the standard normal distribution from Table 3 (Appendix, FOB).

To find the 95th percentile of the standard normal distribution, we look for the probability 0.95 in Column A. The corresponding value of x is $z_{.95}$, the 95th percentile. It is between 1.64 and 1.65, and we take it approximately to be $z_{.95} = 1.645$. Using the MS-Excel function NORMSINV(0.95) we obtain $z_{.95} = 1.64485$.

To find the 25th percentile of the standard normal distribution, we first notice that p (the area under the curve to the left of the percentile) is less than 0.5. Therefore, the corresponding percentile should be a negative number. What number should it be? Look for the probability 1 − 0.25 = 0.75 in Column A. The negative of the corresponding value of x is $z_{.25}$, the 25th percentile. It is between −0.68 and −0.67, and we take it approximately to be $z_{.25} = -0.675$. Using the MS-Excel function NORMSINV(0.25) we obtain $z_{.25} = -0.67449$.
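The table lookups and Excel calls above have a direct counterpart in Python's standard library. The sketch below uses `statistics.NormalDist` (Python 3.8 or later) as a stand-in for Table 3: `cdf` plays the role of Φ and `inv_cdf` returns percentiles. The variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, N(0, 1)

phi_084 = Z.cdf(0.84)                     # Pr(Z <= 0.84)
left_tail = Z.cdf(-0.84)                  # Pr(Z <= -0.84) = 1 - Phi(0.84)
narrow = Z.cdf(0.24) - Z.cdf(0.12)        # Pr(0.12 < Z <= 0.24)
lower_band = Z.cdf(-0.84) - Z.cdf(-1.96)  # Pr(-1.96 < Z <= -0.84)
central = Z.cdf(1.96) - Z.cdf(-1.96)      # Pr(-1.96 < Z <= 1.96)

z95 = Z.inv_cdf(0.95)                     # 95th percentile
z25 = Z.inv_cdf(0.25)                     # 25th percentile (negative, since p < 0.5)

for name, value in [("Phi(0.84)", phi_084), ("Phi(-0.84)", left_tail),
                    ("narrow", narrow), ("lower band", lower_band),
                    ("central", central), ("z_.95", z95), ("z_.25", z25)]:
    print(f"{name:>10}: {value:.5f}")
```

Small differences in the last digit relative to the table values come from rounding in the printed table.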

5.3 From Normal to Standard Normal

Normal to Standard Normal

If $X \sim N(\mu, \sigma^2)$, then

$$
Z = \frac{X - \mu}{\sigma} \sim N(0, 1).
$$

This result is very important. A consequence is described in the following:

Calculating probabilities under normal curve

If $X \sim N(\mu, \sigma^2)$ and $Z \sim N(0, 1)$, then

$$
\Pr(a < X \le b) = \Pr\left(\frac{a-\mu}{\sigma} < Z \le \frac{b-\mu}{\sigma}\right) = \Phi\left(\frac{b-\mu}{\sigma}\right) - \Phi\left(\frac{a-\mu}{\sigma}\right).
$$

Another important consequence is that the quantiles (percentiles) of normal distributions can be expressed as a linear function of standard normal percentiles.
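The standardization result translates into a two-line function. The sketch below computes Φ from the error function `math.erf` (using the identity Φ(z) = ½[1 + erf(z/√2)]) and then applies the formula for Pr(a < X ≤ b). The function names `phi` and `normal_interval_prob` are hypothetical.

```python
import math

def phi(z):
    """Standard normal cdf, expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_interval_prob(a, b, mu, sigma):
    """Pr(a < X <= b) for X ~ N(mu, sigma^2), computed by standardizing."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

# Illustrative check: one standard deviation either side of the mean.
print(f"{normal_interval_prob(46, 54, mu=50, sigma=4):.4f}")
```

Note that the function takes the standard deviation σ, not the variance σ², as its last argument.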

Calculating percentiles of normal curve

If $x_p$ and $z_p$ denote the 100pth percentiles of $N(\mu, \sigma^2)$ and $N(0, 1)$, respectively, then $x_p = \mu + \sigma z_p$.

Example 5.3.1. Suppose that the diastolic blood pressures (DBP) for 35-to-44-year-old men are normally distributed with mean 80 mm Hg and variance 144 mm Hg². If persons with DBP between 90 and 100 mm Hg inclusive are considered to be mildly hypertensive, what is the probability that a randomly selected person from this population will be a mild hypertensive?

Let X = DBP of a 35-to-44-year-old man. Then $X \sim N(80, 144)$, so $\sigma = 12$.

$$
\begin{aligned}
\Pr(90 \le X \le 100) &= \Pr\left(\frac{90-80}{12} \le \frac{X-80}{12} \le \frac{100-80}{12}\right) \\
&= \Pr(0.833 \le Z \le 1.667) \\
&= \Phi(1.667) - \Phi(0.833) \\
&= 0.9522 - 0.7977 = 0.1545.
\end{aligned}
$$

If persons with DBP above 100 mm Hg are considered to be hypertensive, what proportion of persons in this population are hypertensive?

$$
\begin{aligned}
\Pr(X > 100) &= \Pr\left(\frac{X-80}{12} > \frac{100-80}{12}\right) \\
&= \Pr(Z > 1.667) \\
&= 1 - \Phi(1.667) \\
&= 1 - 0.9522 = 0.0478.
\end{aligned}
$$

Find the upper and lower 5th percentiles of this distribution.

$x_{.05} = \mu + \sigma z_{.05} = 80 + 12 \times (-1.645) = 60.3$ mm Hg.
$x_{.95} = \mu + \sigma z_{.95} = 80 + 12 \times 1.645 = 99.7$ mm Hg.

If a person's DBP belongs to the upper 25% of the distribution in this population, he or she will be eligible for a clinical study related to hypertension. What is the minimum DBP level required to be eligible for the study?

Find ?? such that $\Pr(X > ??) = 0.25$. Or, find $x_{.75}$ such that $\Pr(X \le x_{.75}) = 0.75$. In other words, we need to calculate the 75th percentile of the distribution of X. We know that

$x_{.75} = \mu + \sigma z_{.75} = 80 + 12 \times 0.675 = 88.1$ mm Hg.

Using the MS-Excel function NORMINV(0.75,80,12) we obtain 88.09 mm Hg.
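The probabilities and percentiles computed so far in Example 5.3.1 can be cross-checked without tables. The sketch below uses `statistics.NormalDist` with µ = 80 and σ = 12 (the square root of the variance 144); the variable names are illustrative.

```python
from statistics import NormalDist

dbp = NormalDist(mu=80, sigma=12)  # variance 144 mm Hg^2, so sigma = 12 mm Hg

p_mild = dbp.cdf(100) - dbp.cdf(90)   # Pr(90 <= X <= 100)
p_hyper = 1 - dbp.cdf(100)            # Pr(X > 100)
x05 = dbp.inv_cdf(0.05)               # lower 5th percentile
x95 = dbp.inv_cdf(0.95)               # upper 5th percentile (95th percentile)
x75 = dbp.inv_cdf(0.75)               # minimum DBP for the upper 25%

print(f"Pr(mild hypertensive) = {p_mild:.4f}")
print(f"Pr(hypertensive)      = {p_hyper:.4f}")
print(f"x_.05 = {x05:.1f}, x_.95 = {x95:.1f}, x_.75 = {x75:.2f} mm Hg")
```

Because X is continuous, including or excluding the endpoints 90 and 100 does not change the probability.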

If 20 patients are selected at random, what is the probability that exactly 5 of them will be hypertensive (DBP > 100 mm Hg)?

Let Y = number of hypertensives out of 20. Then Y is Binomial with n = 20 and p = 0.0478 (the value of $\Pr(X > 100)$ calculated above).

$$
\Pr(Y = 5) = \binom{20}{5} (0.0478)^5 (1 - 0.0478)^{15} = 0.0019. \tag{5.3.1}
$$

If 20 patients are selected at random, what is the probability that no more than 5 of them will be hypertensive (DBP > 100 mm Hg)?

$$
\Pr(Y \le 5) = \sum_{x=0}^{5} \binom{20}{x} (0.0478)^x (1 - 0.0478)^{20-x} = 0.9997. \tag{5.3.2}
$$

Example 5.3.2. Continuity correction. Some random variables are usually measured only to the nearest integer. In such cases, when calculating the probability that the random variable takes a value in an interval, an adjustment for continuity needs to be made.

Suppose that the age distribution of a group of patients being treated for HIV infection is normal with mean 46 years and standard deviation 5 years. What is the probability that a randomly selected individual in this group will be 60 or above?

$$
\Pr(X \ge 60) = 1 - \Phi\left(\frac{59.5 - 46}{5}\right) = 0.0035. \tag{5.3.3}
$$

What is the probability that a randomly selected individual in this group will be below 60?

$$
\Pr(X < 60) = \Phi\left(\frac{59.5 - 46}{5}\right) = 0.9965. \tag{5.3.4}
$$

What is the probability that a randomly selected individual in this group will be above 50 but equal to or below 60?

$$
\Pr(50 < X \le 60) = \Phi\left(\frac{60.5 - 46}{5}\right) - \Phi\left(\frac{50.5 - 46}{5}\right) = 0.1822. \tag{5.3.5}
$$
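Both the binomial calculations that close Example 5.3.1 and the continuity-corrected probabilities of Example 5.3.2 are easy to check numerically. The sketch below uses `math.comb` for the binomial coefficient and `statistics.NormalDist` for Φ; the helper `binom_pmf` and the variable names are illustrative, not part of the notes.

```python
import math
from statistics import NormalDist

# Example 5.3.1, last two questions: Y ~ Binomial(n = 20, p = 0.0478).
n, p = 20, 0.0478

def binom_pmf(k, n, p):
    """Pr(Y = k) for a binomial count with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

p_exactly_5 = binom_pmf(5, n, p)
p_at_most_5 = sum(binom_pmf(k, n, p) for k in range(6))  # k = 0, 1, ..., 5

# Example 5.3.2: age ~ N(46, 5^2), recorded to the nearest whole year.
age = NormalDist(mu=46, sigma=5)
p_60_or_above = 1 - age.cdf(59.5)                    # ages 60, 61, ...
p_below_60 = age.cdf(59.5)                           # ages ..., 58, 59
p_above_50_up_to_60 = age.cdf(60.5) - age.cdf(50.5)  # ages 51, ..., 60

print(f"{p_exactly_5:.4f} {p_at_most_5:.4f}")
print(f"{p_60_or_above:.4f} {p_below_60:.4f} {p_above_50_up_to_60:.4f}")
```

The half-unit shifts reflect that a recorded age of 60 stands for any true value between 59.5 and 60.5.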

5.4 Covariance and Correlation

5.4.1 Covariance

The covariance between two random variables measures how one variable varies relative to the other. Recall that the variance of a random variable X is defined as

$$
\operatorname{Var}(X) = \sigma_x^2 = E\left\{(X - \mu_x)^2\right\} = E(X^2) - \mu_x^2. \tag{5.4.1}
$$

If we have two random variables X (height of a person) and Y (weight of the same person), we might be interested in seeing how values of X are associated with the values of Y. This leads to the covariance

$$
\operatorname{Cov}(X, Y) = \sigma_{xy} = E\left\{(X - \mu_x)(Y - \mu_y)\right\} = E(XY) - \mu_x \mu_y. \tag{5.4.2}
$$

A positive $\sigma_{xy}$ indicates that larger values of X tend to go with larger values of Y, i.e., a positive relationship. On the other hand, if larger (smaller) values of X are accompanied by smaller (larger) values of Y, the covariance is negative.

5.4.2 Correlation

The covariance indicates the direction of the association. Since the covariance can be any value between $-\infty$ and $\infty$, it is of little use if we want to measure the strength of the association between two random variables. The correlation coefficient is used to measure the strength as well as the direction of the association. It is defined as

$$
\rho = \operatorname{Corr}(X, Y) = \frac{\sigma_{xy}}{\sigma_x \sigma_y}. \tag{5.4.3}
$$

Correlation coefficient $\rho$

(i) $-1 \le \rho \le 1$,
(ii) $\rho = 0 \Rightarrow$ no linear relationship,
(iii) $\rho = -1 \Rightarrow$ perfect negative linear relationship,
(iv) $\rho = +1 \Rightarrow$ perfect positive linear relationship.
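Formulas (5.4.2) and (5.4.3) can be applied mechanically once a joint distribution is given. The sketch below uses a small, made-up joint probability table for two 0/1 variables (not the height and weight example) and computes the covariance as E(XY) − µ_x µ_y and the correlation as σ_xy / (σ_x σ_y).

```python
import math

# Hypothetical joint distribution {(x, y): probability}; values are made up.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def expect(g):
    """E[g(X, Y)] under the joint distribution above."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: x**2) - mu_x**2       # (5.4.1) applied to X
var_y = expect(lambda x, y: y**2) - mu_y**2       # (5.4.1) applied to Y
cov_xy = expect(lambda x, y: x * y) - mu_x * mu_y  # (5.4.2)
rho = cov_xy / math.sqrt(var_x * var_y)            # (5.4.3)

print(f"cov = {cov_xy:.3f}, rho = {rho:.3f}")
```

Here most of the probability sits on the pairs (0, 0) and (1, 1), so larger X tends to go with larger Y and both the covariance and the correlation come out positive.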

5.4.3 Independence

Two random variables X and Y are independent iff

$$
\Pr(X \le x \mid Y = y) = \Pr(X \le x), \qquad \text{for all } x, y. \tag{5.4.4}
$$

That is, knowledge about one variable does not change the distribution of the other.

Is the distribution of weights for the age-group 10-15 the same as the distribution of weights for the age-group 30-35? (Age-group and weight are not independent variables.)

Is there any reason to believe that the distribution of birthweights differs by birth month? (Birthweight and birth month are independent random variables.)

Independence and correlation

For independent random variables X and Y, (i) $\sigma_{xy} = 0$, and (ii) $\rho = 0$.
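A small discrete example makes the relationship between independence and covariance concrete. The sketch below builds one joint distribution that is independent by construction and confirms its covariance is zero. It also builds a dependent pair, Y = X² with X uniform on {−1, 0, 1}, whose covariance is nevertheless zero, so a zero covariance by itself does not establish independence. Both distributions and both helper functions are hypothetical, and for discrete variables independence is checked through the equivalent product rule Pr(X = x, Y = y) = Pr(X = x) Pr(Y = y).

```python
from fractions import Fraction
from itertools import product

def covariance(joint):
    """Cov(X, Y) = E(XY) - E(X)E(Y) for a joint pmf {(x, y): prob}."""
    ex = sum(p * x for (x, _), p in joint.items())
    ey = sum(p * y for (_, y), p in joint.items())
    exy = sum(p * x * y for (x, y), p in joint.items())
    return exy - ex * ey

def is_independent(joint):
    """True if the joint pmf equals the product of its marginals everywhere."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    marg_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in xs}
    marg_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in ys}
    return all(joint.get((x, y), 0) == marg_x[x] * marg_y[y]
               for x, y in product(xs, ys))

# Independent by construction: joint pmf = product of two marginals.
px = {0: Fraction(1, 4), 1: Fraction(3, 4)}
py = {1: Fraction(1, 3), 2: Fraction(2, 3)}
indep = {(x, y): px[x] * py[y] for x in px for y in py}

# Dependent: X uniform on {-1, 0, 1} and Y = X^2 is fully determined by X.
dep = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

print(covariance(indep), is_independent(indep))
print(covariance(dep), is_independent(dep))
```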

On the other hand, if $\sigma_{xy} = 0$ (and hence $\rho = 0$), X and Y may or may not be independent.

5.5 Linear combinations of random variables

Frequently, we encounter situations where a new variable of interest is created by adding two or more random variables, subtracting one from the other, or taking linear combinations.

Example 5.5.1. Let $X_1, X_2, X_3, X_4$, and $X_5$ be repeated cholesterol measurements taken at 5 different times during the day. We would like to know the average cholesterol level for that specific day. Thus the variable we are interested in is

$$
\bar{X} = \frac{X_1 + X_2 + X_3 + X_4 + X_5}{5} = \frac{1}{5}X_1 + \frac{1}{5}X_2 + \frac{1}{5}X_3 + \frac{1}{5}X_4 + \frac{1}{5}X_5. \tag{5.5.1}
$$

Example 5.5.2. In the above example one might be interested in the difference between the first and last cholesterol levels for that specific day. Thus the variable we are interested in is

$$
D = X_5 - X_1 = 1 \cdot X_5 + (-1) \cdot X_1. \tag{5.5.2}
$$

In general, we would like to know the distribution of a linear combination of random variables of the form

$$
Y = c_1 X_1 + c_2 X_2 + \cdots + c_n X_n, \tag{5.5.3}
$$

where $c_1, c_2, \ldots, c_n$ are scalars and $X_1, X_2, \ldots, X_n$ are random variables.

Expected value of a linear combination of random variables

If Y is as defined in (5.5.3), and $\mu_j = E(X_j)$, $j = 1, 2, \ldots, n$, then

$$
\mu_y = E(Y) = c_1\mu_1 + c_2\mu_2 + \cdots + c_n\mu_n.
$$

Thus, the expectation of a sum of random variables is the sum of their expectations. That is,

$$
E(X_1 + X_2 + \cdots + X_n) = E(X_1) + E(X_2) + \cdots + E(X_n). \tag{5.5.4}
$$

The expectation of the difference between two random variables is the difference of their expectations. That is,

$$
E(X_1 - X_2) = E(X_1) - E(X_2). \tag{5.5.5}
$$

Variance of the sum of two random variables

If $Y = X_1 + X_2$, then

$$
\sigma_y^2 = \operatorname{Var}(Y) = \operatorname{Var}(X_1) + \operatorname{Var}(X_2) + 2\operatorname{Cov}(X_1, X_2).
$$

Similarly,

$$
\operatorname{Var}(X_1 - X_2) = \operatorname{Var}(X_1) + \operatorname{Var}(X_2) - 2\operatorname{Cov}(X_1, X_2). \tag{5.5.6}
$$

If $X_1$ and $X_2$ are independent, then $\operatorname{Cov}(X_1, X_2) = 0$. And therefore,

$$
\operatorname{Var}(X_1 + X_2) = \operatorname{Var}(X_1 - X_2) = \operatorname{Var}(X_1) + \operatorname{Var}(X_2). \tag{5.5.7}
$$
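The variance formulas for a sum and a difference can be verified on a small example where the two variables are correlated, so the covariance term matters. The sketch below uses a made-up joint distribution for two 0/1 variables and exact fractions; it computes each variance directly from its definition and compares it with the right-hand side of the formulas above.

```python
from fractions import Fraction as F

# Hypothetical joint pmf {(x1, x2): probability}; values are made up.
joint = {(0, 0): F(4, 10), (0, 1): F(1, 10), (1, 0): F(1, 10), (1, 1): F(4, 10)}

def expect(g):
    """E[g(X1, X2)] under the joint pmf above."""
    return sum(p * g(a, b) for (a, b), p in joint.items())

def var(g):
    """Var[g(X1, X2)] = E[g^2] - (E[g])^2."""
    return expect(lambda a, b: g(a, b) ** 2) - expect(g) ** 2

cov = expect(lambda a, b: a * b) - expect(lambda a, b: a) * expect(lambda a, b: b)
var_x1 = var(lambda a, b: a)
var_x2 = var(lambda a, b: b)
var_sum = var(lambda a, b: a + b)    # computed directly from the definition
var_diff = var(lambda a, b: a - b)   # computed directly from the definition

print(var_sum, var_x1 + var_x2 + 2 * cov)    # the two values should agree
print(var_diff, var_x1 + var_x2 - 2 * cov)   # the two values should agree
```

With a positive covariance, as here, the sum is more variable and the difference less variable than independence would suggest.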

Variance of the linear combination of independent random variables

If $X_1, X_2, \ldots, X_n$ are independent random variables, then

$$
\operatorname{Var}(Y) = \operatorname{Var}(c_1 X_1 + c_2 X_2 + \cdots + c_n X_n) = c_1^2 \operatorname{Var}(X_1) + c_2^2 \operatorname{Var}(X_2) + \cdots + c_n^2 \operatorname{Var}(X_n).
$$

Example 5.5.3. Suppose $X_1$ and $X_2$ respectively denote the pre- and post-treatment fasting triglyceride levels (mg/dl) for a randomly selected patient. If $X_1 \sim N(113, 225)$ and $X_2 \sim N(90, 196)$, what can we say about the mean change?

Let $Y = X_1 - X_2$ be the reduction in the fasting triglyceride level (pre-treatment minus post-treatment). Then,

$$
\mu_y = E(Y) = E(X_1 - X_2) = E(X_1) - E(X_2) = 113 - 90 = 23 \text{ mg/dl}.
$$

5.6 Normal approximation to Binomial distribution

In many situations the normal distribution arises from the sum of independent random variables. A binomial random variable Y with parameters n and p can be treated as a sum of n independent random variables. This is easy to see, as each trial in the binomial experiment can be treated as an independent binomial experiment with a single trial.

Let $X_1$ be the number of successes in the 1st trial, $X_2$ be the number of successes in the 2nd trial, and so on. Then each $X_i$, $i = 1, 2, \ldots, n$, can take the value either 0 (failure) or 1 (success). Each of these variables is referred to as a Bernoulli random variable. Their sum

$$
Y = X_1 + X_2 + \cdots + X_n
$$

represents the number of successes in n trials. Thus a binomial random variable is the sum of independent Bernoulli random variables.

Normal approximation to Binomial distribution

If n is large and p is not too close to 0 or 1, so that $np(1 - p) \ge 5$, then the $B(n, p)$ distribution can be well-approximated by a $N(\mu = np, \sigma^2 = npq)$ distribution, where $q = 1 - p$.

Example 5.6.1. What is the probability that the number of neutrophils will be between 50 and 75 out of 100 white blood cells, where the probability that any one cell is a neutrophil is 0.6?

Let X = number of neutrophils out of 100 white blood cells. Then $X \sim B(100, 0.6)$.

$$
\begin{aligned}
\Pr(50 \le X \le 75) &= \sum_{x=50}^{75} \binom{100}{x} (0.6)^x (1 - 0.6)^{100-x} \\
&= \text{binomdist}(75, 100, .6, \text{true}) - \text{binomdist}(49, 100, .6, \text{true}) \\
&= 0.9826.
\end{aligned} \tag{5.6.1}
$$

Now, as $np(1 - p) = 24 > 5$, this should be well approximated by the $N(\mu = np = 60, \sigma^2 = npq = 24)$ distribution:

$$
\begin{aligned}
\Pr(50 \le X \le 75) &\approx \Phi\left(\frac{75.5 - 60}{\sqrt{24}}\right) - \Phi\left(\frac{49.5 - 60}{\sqrt{24}}\right) \\
&= \Phi(3.164) - \Phi(-2.143) \\
&= 0.983.
\end{aligned} \tag{5.6.2}
$$
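The comparison in Example 5.6.1 can be reproduced with the Python standard library in place of Excel. The sketch below sums the exact binomial probabilities with `math.comb` and then evaluates the normal approximation with the same continuity correction (49.5 and 75.5); the helper `binom_pmf` is an illustrative name.

```python
import math
from statistics import NormalDist

n, p = 100, 0.6

def binom_pmf(k, n, p):
    """Exact Pr(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Exact probability: sum the pmf over k = 50, 51, ..., 75.
exact = sum(binom_pmf(k, n, p) for k in range(50, 76))

# Normal approximation N(np, npq) with continuity correction.
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx_dist = NormalDist(mu, sigma)
approx = approx_dist.cdf(75.5) - approx_dist.cdf(49.5)

print(f"npq = {n * p * (1 - p):.0f}")
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```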

5.7 Normal approximation to the Poisson distribution

Normal approximation to Poisson distribution

If $\mu$ is large ($\mu \ge 10$), then the $P(\mu)$ distribution can be well-approximated by a $N(\mu, \sigma^2 = \mu)$ distribution.

Example 5.7.1. Suppose the number of accidents on the Highland Park Bridge over a year follows a Poisson distribution with mean 20. What is the probability that there will be more than 5 accidents over a 6-month period?

The mean number of accidents over 6 months is 20/2 = 10. Using the Poisson distribution with $\mu = 10$,

$$
\Pr(X > 5) = 1 - \Pr(X \le 5) = 0.9329. \tag{5.7.1}
$$

Using the normal distribution with $\mu = 10$ and $\sigma^2 = 10$,

$$
\Pr(X > 5) \approx 1 - \Phi\left(\frac{5.5 - 10}{\sqrt{10}}\right) = 0.923. \tag{5.7.2}
$$

SOLVE PROBLEMS 5.99-5.101, 5.102-104.
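The two answers in Example 5.7.1 can be checked in a few lines. The sketch below computes the exact Poisson tail from the pmf and the normal approximation with the continuity correction at 5.5; the helper `poisson_pmf` is an illustrative name.

```python
import math
from statistics import NormalDist

mu = 20 / 2  # yearly mean of 20 accidents gives 10 per 6-month period

def poisson_pmf(k, mu):
    """Exact Pr(X = k) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**k / math.factorial(k)

# Exact: Pr(X > 5) = 1 - Pr(X <= 5).
exact = 1 - sum(poisson_pmf(k, mu) for k in range(6))

# Normal approximation N(mu, mu) with continuity correction.
approx = 1 - NormalDist(mu, math.sqrt(mu)).cdf(5.5)

print(f"exact = {exact:.4f}, normal approximation = {approx:.3f}")
```

The gap between the two values is larger than in the binomial example because µ = 10 sits right at the boundary of the rule of thumb.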