Business Statistics 41000: Homework #2 Solutions

Drew Creal

February 9, 2014

Question #1. Discrete Random Variables and Their Distributions

(a) The probabilities have to sum to 1, which means that 0.1 + 0.3 + 0.4 + P(X = 0.1) = 1. We can therefore deduce that P(X = 0.1) = 0.2.

(b) [Figure: bar plot of the distribution p(x), with bars of height 0.1, 0.3, 0.4, 0.2 at x = 0.02, 0.04, 0.07, 0.10.]

(c) The random variable X can take on two values greater than 0.05. We sum their probabilities: P(X > 0.05) = P(X = 0.07) + P(X = 0.1) = 0.4 + 0.2 = 0.6.

(d) E[X] = 0.1(0.02) + 0.3(0.04) + 0.4(0.07) + 0.2(0.1) = 0.002 + 0.012 + 0.028 + 0.02 = 0.062
(e) V[X] = 0.1(0.02 − 0.062)² + 0.3(0.04 − 0.062)² + 0.4(0.07 − 0.062)² + 0.2(0.1 − 0.062)²
= 0.1(−0.042)² + 0.3(−0.022)² + 0.4(0.008)² + 0.2(0.038)²
= 0.1(0.001764) + 0.3(0.000484) + 0.4(0.000064) + 0.2(0.001444)
= 0.0001764 + 0.0001452 + 0.0000256 + 0.0002888
= 0.000636

(f) The standard deviation is the square root of the variance. We have σ_X = √(σ_X²) = √0.000636 ≈ 0.0252.
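The mean, variance, and standard deviation in (d)–(f) can be cross-checked with a short script. (Python is our choice here, not part of the course materials; the helper names `mean` and `variance` are ours.)

```python
import math

# The pmf from Question 1: value -> probability
pmf = {0.02: 0.1, 0.04: 0.3, 0.07: 0.4, 0.10: 0.2}

def mean(pmf):
    return sum(p * x for x, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum(p * (x - m) ** 2 for x, p in pmf.items())

m = mean(pmf)        # 0.062
v = variance(pmf)    # 0.000636
sd = math.sqrt(v)    # about 0.0252
```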
Question #2. Discrete Random Variables and Their Distributions

(a) The random variable X can only take on two outcomes, 1 and 0. We are rolling a fair die, so the probability P(X = 1) is the same as the probability that we roll a six, which is P(X = 1) = 1/6. The probability that X = 0 is therefore 5/6. It is important that you recognize that X ~ Bernoulli(1/6). We can write this in a two-way table:

x    p(x)
0    5/6
1    1/6

(b) We know that X ~ Bernoulli(1/6). In class, we wrote out the formulas for the mean and variance of a Bernoulli(p) random variable. These are E[X] = p and V[X] = p(1 − p). Now, we just plug p = 1/6 into these formulas to get E[X] = 1/6 and V[X] = 5/36 ≈ 0.139.
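The Bernoulli mean and variance formulas used above can be verified exactly with rational arithmetic (an illustrative sketch; using `Fraction` keeps 1/6 exact and is our choice, not the course's):

```python
from fractions import Fraction

p = Fraction(1, 6)     # P(X = 1): the probability of rolling a six
mean_x = p             # E[X] = p
var_x = p * (1 - p)    # V[X] = p(1 - p) = 5/36, about 0.139
```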
Question #3. Discrete Random Variables and Their Distributions

(a) We just sum the probabilities between 0 and 0.15. P(0 < R < 0.15) = P(R = 0.01) + P(R = 0.05) + P(R = 0.10) = 0.2 + 0.3 + 0.2 = 0.7. Notice that we do not include P(R = 0.15) because it says less than 0.15, not less than or equal to.

(b) We want to compute P(R < 0) = P(R = −0.05) + P(R = −0.01) = 0.1 + 0.1 = 0.2.

(c) Here, we apply the formula for the expected value that we gave in class.

E[R] = 0.1(−0.05) + 0.1(−0.01) + 0.2(0.01) + 0.3(0.05) + 0.2(0.10) + 0.1(0.15)
= −0.005 − 0.001 + 0.002 + 0.015 + 0.02 + 0.015
= 0.046

(d) Here, we first compute the variance and then just take the square root.

V[R] = 0.1(−0.05 − 0.046)² + 0.1(−0.01 − 0.046)² + 0.2(0.01 − 0.046)² + 0.3(0.05 − 0.046)² + 0.2(0.10 − 0.046)² + 0.1(0.15 − 0.046)²
= 0.1(−0.096)² + 0.1(−0.056)² + 0.2(−0.036)² + 0.3(0.004)² + 0.2(0.054)² + 0.1(0.104)²
= 0.00092 + 0.00031 + 0.00026 + 0.0000048 + 0.00058 + 0.00108
= 0.00316

Taking the square root we get σ_R = √(σ_R²) = √0.00316 ≈ 0.0562.
(e) Here is a plot of the distribution. [Figure: bar plot of p(r) against r, with bars of height 0.1, 0.1, 0.2, 0.3, 0.2, 0.1 at r = −0.05, −0.01, 0.01, 0.05, 0.10, 0.15.]
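Parts (a), (c), and (d) can be checked numerically. (An illustrative Python sketch; `pmf_r` is our name for the distribution of R.)

```python
import math

# pmf of R from Question 3: value -> probability
pmf_r = {-0.05: 0.1, -0.01: 0.1, 0.01: 0.2, 0.05: 0.3, 0.10: 0.2, 0.15: 0.1}

# (a) strict inequalities exclude the endpoint r = 0.15
prob = sum(p for r, p in pmf_r.items() if 0 < r < 0.15)       # 0.7

# (c)-(d) mean, variance, and standard deviation
mean_r = sum(p * r for r, p in pmf_r.items())                 # 0.046
var_r = sum(p * (r - mean_r) ** 2 for r, p in pmf_r.items())  # about 0.00316
sd_r = math.sqrt(var_r)                                       # about 0.0562
```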
Question #4. Marginal and Conditional Distributions of Discrete Random Variables

(a) Here is the tree diagram.

G = 1 (GOOD), with probability 0.5:
    E = 1 with probability 0.8:  P(E = 1 and G = 1) = 0.5 × 0.8 = 0.4
    E = 0 with probability 0.2:  P(E = 0 and G = 1) = 0.5 × 0.2 = 0.1
G = 0 (BAD), with probability 0.5:
    E = 1 with probability 0.4:  P(E = 1 and G = 0) = 0.5 × 0.4 = 0.2
    E = 0 with probability 0.6:  P(E = 0 and G = 0) = 0.5 × 0.6 = 0.3

(b) Given the information from the tree diagram above, we can easily represent it as a two-way table.

          G = 0    G = 1    p_E(e)
E = 0     0.3      0.1      0.4
E = 1     0.2      0.4      0.6
p_G(g)    0.5      0.5

(c) Here, we are given P(E = 1 | G = 1) but we want to know P(G = 1 | E = 1). (NOTE: This is exactly the same problem as the example from Lecture #3 where we were testing for a disease.) However, we have already computed the joint distribution of (E, G) and it is simple to obtain the marginal distributions. Therefore, we can use our formula for the definition of
a conditional probability:

P(G = 1 | E = 1) = P(G = 1, E = 1) / P(E = 1) = 0.4 / 0.6 = 0.667

One final comment to make: in this problem, we have implicitly used Bayes' Rule.
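The Bayes' Rule step can be sketched in a few lines (illustrative Python; the variable names are ours):

```python
# Probabilities from the tree diagram in part (a)
p_g1 = 0.5            # P(G = 1), a good economy
p_e1_given_g1 = 0.8   # P(E = 1 | G = 1)
p_e1_given_g0 = 0.4   # P(E = 1 | G = 0)

# Marginal P(E = 1) by the law of total probability
p_e1 = p_e1_given_g1 * p_g1 + p_e1_given_g0 * (1 - p_g1)   # 0.6

# Bayes' Rule: P(G = 1 | E = 1) = P(E = 1 | G = 1) P(G = 1) / P(E = 1)
p_g1_given_e1 = p_e1_given_g1 * p_g1 / p_e1                # 0.4 / 0.6
```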
Question #5. Marginal and Conditional Distributions of Discrete Random Variables

(a) To compute P(X ≤ 0.1, Y ≤ 0.1), we need to add the probabilities.

P(X ≤ 0.1, Y ≤ 0.1) = P(X = 0.05, Y = 0.05) + P(X = 0.1, Y = 0.05) + P(X = 0.05, Y = 0.1) + P(X = 0.1, Y = 0.1)
= 0.10 + 0.07 + 0.03 + 0.30 = 0.50

(b) To obtain the marginal distribution of X, we simply add each column downwards.

            X = 0.05    X = 0.10    X = 0.15
Y = 0.05    0.10        0.07        0.07
Y = 0.10    0.03        0.30        0.03
Y = 0.15    0.05        0.05        0.30
p_X(x)      0.18        0.42        0.40

Notice that p_X(x) is a probability distribution itself because P(X = 0.05) + P(X = 0.10) + P(X = 0.15) = 0.18 + 0.42 + 0.40 = 1. The probabilities sum to one.

(c) To obtain the marginal distribution of Y, we simply add each row across.

            X = 0.05    X = 0.10    X = 0.15    p_Y(y)
Y = 0.05    0.10        0.07        0.07        0.24
Y = 0.10    0.03        0.30        0.03        0.36
Y = 0.15    0.05        0.05        0.30        0.40

Notice that p_Y(y) is also a probability distribution itself because P(Y = 0.05) + P(Y = 0.10) + P(Y = 0.15) = 0.24 + 0.36 + 0.40 = 1. The probabilities sum to one.

(d) We know the joint probabilities and the marginal probabilities from our work above; therefore, to determine the conditional distribution P(Y = y | X = 0.15) we can use our formulas from
class. Since Y has three outcomes, there are three probabilities that we need to calculate.

P(Y = 0.05 | X = 0.15) = P(Y = 0.05, X = 0.15) / P(X = 0.15) = 0.07 / 0.40 = 0.175
P(Y = 0.10 | X = 0.15) = P(Y = 0.10, X = 0.15) / P(X = 0.15) = 0.03 / 0.40 = 0.075
P(Y = 0.15 | X = 0.15) = P(Y = 0.15, X = 0.15) / P(X = 0.15) = 0.30 / 0.40 = 0.75

Notice that P(Y = y | X = 0.15) is also a probability distribution itself because 0.175 + 0.075 + 0.75 = 1. The probabilities sum to one.

(e) We know the joint probabilities and the marginal probabilities from our work above; therefore, to determine the conditional distribution P(Y = y | X = 0.05) we can use our formulas from class. Since Y has three outcomes, there are three probabilities that we need to calculate.

P(Y = 0.05 | X = 0.05) = P(Y = 0.05, X = 0.05) / P(X = 0.05) = 0.10 / 0.18 = 0.555
P(Y = 0.10 | X = 0.05) = P(Y = 0.10, X = 0.05) / P(X = 0.05) = 0.03 / 0.18 = 0.167
P(Y = 0.15 | X = 0.05) = P(Y = 0.15, X = 0.05) / P(X = 0.05) = 0.05 / 0.18 = 0.278

Notice that P(Y = y | X = 0.05) is also a probability distribution itself because 0.555 + 0.167 + 0.278 = 1. The probabilities sum to one.

(f) By comparing (d) and (e), the distributions are clearly not independent, as the conditional distributions we calculated are not equal to one another, nor are they equal to the marginal distribution p_Y(y). To determine whether X and Y are positively or negatively correlated, compare the conditional distributions P(Y = 0.15 | X = 0.15) vs. P(Y = 0.15 | X = 0.05), and P(Y = 0.05 | X = 0.15) vs. P(Y = 0.05 | X = 0.05). Notice that conditional on X being small (large), the probabilities get larger for Y when it is also small (large). Therefore, they are positively correlated.

(g) To calculate the mean and variance of Y, we need to use the marginal probability distribution from part (c).

E[Y] = 0.24(0.05) + 0.36(0.10) + 0.40(0.15) = 0.012 + 0.036 + 0.06 = 0.108

V[Y] = 0.24(0.05 − 0.108)² + 0.36(0.10 − 0.108)² + 0.40(0.15 − 0.108)²
= 0.24(−0.058)² + 0.36(−0.008)² + 0.40(0.042)²
= 0.24(0.003364) + 0.36(0.000064) + 0.40(0.001764)
= 0.000807 + 0.000023 + 0.000706
= 0.001536

(h) To calculate the conditional mean of the distribution P(Y = y | X = 0.15), we use the probabilities from this distribution.

E[Y | X = 0.15] = 0.175(0.05) + 0.075(0.10) + 0.75(0.15) = 0.00875 + 0.0075 + 0.1125 = 0.12875

If we know X = 15%, we would guess a higher value for Y (0.129 vs. the unconditional mean 0.108) because higher values of Y are more likely when X = 15%.
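The marginal, conditional, and conditional-mean computations above all follow one mechanical recipe, sketched here in Python (variable names are ours, not part of the original solutions):

```python
# Joint distribution from Question 5, keyed by (x, y)
joint = {
    (0.05, 0.05): 0.10, (0.10, 0.05): 0.07, (0.15, 0.05): 0.07,
    (0.05, 0.10): 0.03, (0.10, 0.10): 0.30, (0.15, 0.10): 0.03,
    (0.05, 0.15): 0.05, (0.10, 0.15): 0.05, (0.15, 0.15): 0.30,
}

# Marginal of X: sum the joint probabilities over y
p_x = {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p

# Conditional distribution of Y given X = 0.15, and its mean
cond = {y: joint[(0.15, y)] / p_x[0.15] for y in (0.05, 0.10, 0.15)}
e_y_given_x15 = sum(y * p for y, p in cond.items())   # 0.12875
```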
Question #6. Sampling WITHOUT Replacement and WITH Replacement

(a) The distribution of Y_1 is Bernoulli(0.5).

y_1    p(y_1)
0      1/2
1      1/2

(b) P(Y_2 = y_2 | Y_1 = 1) is the conditional distribution of the second voter given that we chose a Democrat on the first pick:

y_2    P(y_2 | Y_1 = 1)
0      5/9
1      4/9

(c) and (d) To write out the joint probability distribution P(Y_2 = y_2, Y_1 = y_1) in a table we first need to calculate the entries!

P(Y_1 = 1, Y_2 = 1) = P(Y_2 = 1 | Y_1 = 1) P(Y_1 = 1) = (4/9)(1/2) = 0.22
P(Y_1 = 1, Y_2 = 0) = P(Y_2 = 0 | Y_1 = 1) P(Y_1 = 1) = (5/9)(1/2) = 0.28
P(Y_1 = 0, Y_2 = 1) = P(Y_2 = 1 | Y_1 = 0) P(Y_1 = 0) = (5/9)(1/2) = 0.28
P(Y_1 = 0, Y_2 = 0) = P(Y_2 = 0 | Y_1 = 0) P(Y_1 = 0) = (4/9)(1/2) = 0.22

Then, we just put these values in a table. We can calculate the marginal distributions easily by summing over rows and columns.
               Y_2 = 0    Y_2 = 1    p_{Y_1}(y_1)
Y_1 = 0        0.22       0.28       0.50
Y_1 = 1        0.28       0.22       0.50
p_{Y_2}(y_2)   0.50       0.50

(e) Here, we can use our formulas relating joint distributions to conditional and marginal distributions, i.e.

P(Y_1 = y_1, Y_2 = y_2, Y_3 = y_3) = P(Y_3 = y_3 | Y_2 = y_2, Y_1 = y_1) P(Y_2 = y_2 | Y_1 = y_1) P(Y_1 = y_1).

This formula is applied in the table.

(y_1, y_2, y_3)    p(y_1, y_2, y_3)
(0,0,0)            (1/2)(4/9)(3/8) = 0.083
(0,0,1)            (1/2)(4/9)(5/8) = 0.139
(0,1,0)            (1/2)(5/9)(4/8) = 0.139
(0,1,1)            (1/2)(5/9)(4/8) = 0.139
(1,0,0)            (1/2)(5/9)(4/8) = 0.139
(1,0,1)            (1/2)(5/9)(4/8) = 0.139
(1,1,0)            (1/2)(4/9)(5/8) = 0.139
(1,1,1)            (1/2)(4/9)(3/8) = 0.083

(f) Now, we are sampling WITH replacement! The conditional distribution is:

y_2    P(y_2 | Y_1 = 1)
0      5/10
1      5/10

You should see that the marginal distribution of Y_2 and the conditional distributions P(Y_2 = y_2 | Y_1 = 0) and P(Y_2 = y_2 | Y_1 = 1) are all the same. This is one way we can tell that Y_1 and Y_2 are independent.
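The chain-rule calculation in (e) can be automated by walking the tree of draws (an illustrative sketch; the function name and the use of exact fractions are our choices):

```python
from fractions import Fraction
from itertools import product

def joint_three_draws(n_dem=5, n_rep=5):
    """Joint pmf of (Y1, Y2, Y3): three picks WITHOUT replacement;
    Y_i = 1 means pick i is a Democrat."""
    pmf = {}
    for outcome in product((0, 1), repeat=3):
        dem, rep = n_dem, n_rep
        prob = Fraction(1)
        for y in outcome:
            if y == 1:
                prob *= Fraction(dem, dem + rep)
                dem -= 1
            else:
                prob *= Fraction(rep, dem + rep)
                rep -= 1
        pmf[outcome] = prob
    return pmf

pmf = joint_three_draws()
# pmf[(0, 0, 0)] = (1/2)(4/9)(3/8) = 1/12, matching the table above
```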
Question #7. Independence and Identical Distributions

(a) We can quickly calculate the marginal distributions:

               Y_2 = 0    Y_2 = 1    p_{Y_1}(y_1)
Y_1 = 0        0.067      0.233      0.30
Y_1 = 1        0.233      0.467      0.70
p_{Y_2}(y_2)   0.30       0.70

The conditional distributions are

P(Y_2 = 1 | Y_1 = 1) = P(Y_2 = 1, Y_1 = 1) / P(Y_1 = 1) = 0.467 / 0.70 = 0.667
P(Y_2 = 0 | Y_1 = 1) = P(Y_2 = 0, Y_1 = 1) / P(Y_1 = 1) = 0.233 / 0.70 = 0.333
P(Y_2 = 1 | Y_1 = 0) = P(Y_2 = 1, Y_1 = 0) / P(Y_1 = 0) = 0.233 / 0.30 = 0.777
P(Y_2 = 0 | Y_1 = 0) = P(Y_2 = 0, Y_1 = 0) / P(Y_1 = 0) = 0.067 / 0.30 = 0.223

The conditional distributions P(Y_2 = y_2 | Y_1 = 1) and P(Y_2 = y_2 | Y_1 = 0) are not the same: they depend on what we observe for Y_1. Therefore, Y_1 and Y_2 are NOT independent.

(b) The marginal distributions p_{Y_1}(y_1) and p_{Y_2}(y_2) are the same. Therefore, the random variables are identically distributed. They are not i.i.d. because they are not BOTH independent and identically distributed.

(c) We are sampling without replacement. If we sampled with replacement, Y_1 and Y_2 would be independent (we would put the first voter back, so the probability of the second voter being a Democrat would not depend on whether the first was a Democrat).
(d) We need to calculate the joint probabilities of Y_1 and Y_2.

P(Y_2 = 1, Y_1 = 1) = P(Y_2 = 1 | Y_1 = 1) P(Y_1 = 1) = (6999/9999)(7000/10000) = 0.48998
P(Y_2 = 0, Y_1 = 1) = P(Y_2 = 0 | Y_1 = 1) P(Y_1 = 1) = (3000/9999)(7000/10000) = 0.21002
P(Y_2 = 1, Y_1 = 0) = P(Y_2 = 1 | Y_1 = 0) P(Y_1 = 0) = (7000/9999)(3000/10000) = 0.21002
P(Y_2 = 0, Y_1 = 0) = P(Y_2 = 0 | Y_1 = 0) P(Y_1 = 0) = (2999/9999)(3000/10000) = 0.08998

Notice that the population is so large that the answers depend mostly on how we round.

               Y_2 = 0     Y_2 = 1     p_{Y_1}(y_1)
Y_1 = 0        0.08998     0.21002     0.30
Y_1 = 1        0.21002     0.48998     0.70
p_{Y_2}(y_2)   0.30        0.70

The conditional probabilities are:

P(Y_2 = 1 | Y_1 = 1) = P(Y_2 = 1, Y_1 = 1) / P(Y_1 = 1) = 6999/9999 = 0.69997
P(Y_2 = 1 | Y_1 = 0) = P(Y_2 = 1, Y_1 = 0) / P(Y_1 = 0) = 7000/9999 = 0.70007

The probabilities do depend on what we observe for Y_1, but notice that they are very close! The data is approximately i.i.d.
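The near-independence in part (d) is easy to quantify exactly (illustrative Python with exact fractions; the variable names are ours):

```python
from fractions import Fraction

n_dem, n_rep = 7000, 3000
n = n_dem + n_rep

# P(second pick is a Democrat | first pick), sampling without replacement
p_dem_after_dem = Fraction(n_dem - 1, n - 1)   # 6999/9999
p_dem_after_rep = Fraction(n_dem, n - 1)       # 7000/9999

gap = p_dem_after_rep - p_dem_after_dem        # only 1/9999
```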
Question #8. Covariance, Correlation, Independence and Identical Distributions

          W = −5    W = 5    p_V(v)
V = −5    0.05      0.05     0.10
V = 5     0.45      0.45     0.90
p_W(w)    0.50      0.50

          X = −5    X = 5    p_Y(y)
Y = −5    0.45      0.05     0.50
Y = 5     0.05      0.45     0.50
p_X(x)    0.50      0.50

(a) First, we need to compute the means of both X and Y using the marginal distributions.

E[X] = 0.5(−5) + 0.5(5) = 0
E[Y] = 0.5(−5) + 0.5(5) = 0

The covariance between (X, Y) is

cov(X, Y) = 0.45(−5 − 0)(−5 − 0) + 0.05(5 − 0)(−5 − 0) + 0.05(−5 − 0)(5 − 0) + 0.45(5 − 0)(5 − 0)
= 0.45(25) − 0.05(25) − 0.05(25) + 0.45(25)
= 20

(b) First, we need to compute the means of both W and V using the marginal distributions.

E[W] = 0.5(−5) + 0.5(5) = 0
E[V] = 0.1(−5) + 0.9(5) = 4
The covariance between (W, V) is

cov(W, V) = 0.05(−5 − 0)(−5 − 4) + 0.05(5 − 0)(−5 − 4) + 0.45(−5 − 0)(5 − 4) + 0.45(5 − 0)(5 − 4)
= 0.05(45) − 0.05(45) − 0.45(5) + 0.45(5)
= 0

(c) If two random variables are independent, they ALWAYS have zero covariance. From part (a) we saw that σ_XY is not zero, and therefore X and Y are NOT independent.

(d) Be careful here! Two random variables with zero covariance are NOT always independent (see the next question!!). We need to calculate the conditional probabilities.

P(W = −5 | V = −5) = P(W = −5, V = −5) / P(V = −5) = 0.05 / 0.10 = 0.5
P(W = −5 | V = 5) = P(W = −5, V = 5) / P(V = 5) = 0.45 / 0.90 = 0.5

The conditionals are the same, so W and V are independent.

(e) X and Y are identically distributed: they take on the same values and the marginal probabilities are the same. They are NOT i.i.d. because of part (c): they are not BOTH independent and identically distributed.

(f) To compute the correlation between X and Y, we can use the formula:

ρ_{X,Y} = cov(X, Y) / (σ_X σ_Y)

From part (a), we know that cov(X, Y) = 20. First, we need to compute the standard deviations of X and Y. From part (e), we know that they are identically distributed so they
have the same mean and the same variance (and standard deviation).

V[X] = 0.5(−5 − 0)² + 0.5(5 − 0)² = 0.5(25) + 0.5(25) = 25
σ_X = σ_Y = √25 = 5

This implies that

ρ_{X,Y} = cov(X, Y) / (σ_X σ_Y) = 20 / 25 = 4/5
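A small helper makes the covariance and correlation computations in (a) and (f) mechanical (illustrative Python; the helper names `cov` and `var_x` are ours):

```python
import math

# Joint distribution of (X, Y) from Question 8, keyed by (x, y)
joint_xy = {(-5, -5): 0.45, (5, -5): 0.05, (-5, 5): 0.05, (5, 5): 0.45}

def cov(joint):
    ex = sum(p * x for (x, y), p in joint.items())
    ey = sum(p * y for (x, y), p in joint.items())
    return sum(p * (x - ex) * (y - ey) for (x, y), p in joint.items())

def var_x(joint):
    ex = sum(p * x for (x, y), p in joint.items())
    return sum(p * (x - ex) ** 2 for (x, y), p in joint.items())

c = cov(joint_xy)                     # 20
sd_x = math.sqrt(var_x(joint_xy))     # 5
rho = c / (sd_x * sd_x)               # sigma_Y = sigma_X, so rho = 20/25 = 0.8
```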
Question #9. Covariance, Correlation, Independence and Identical Distributions

(a) First, we need to compute the means of both X and Y using the marginal distributions. The marginal distributions are

          X = −1    X = 0    X = 1    p_Y(y)
Y = 0     0         1/3      0        1/3
Y = 1     1/3       0        1/3      2/3
p_X(x)    1/3       1/3      1/3

and therefore we get

E[X] = (1/3)(−1) + (1/3)(0) + (1/3)(1) = 0
E[Y] = (1/3)(0) + (2/3)(1) = 2/3

The covariance between (X, Y) is (in the calculation we only take into consideration those combinations with non-zero joint probability)

cov(X, Y) = (1/3)(0 − 0)(0 − 2/3) + (1/3)(−1 − 0)(1 − 2/3) + (1/3)(1 − 0)(1 − 2/3)
= 0 − 1/9 + 1/9
= 0
(b) The conditional probabilities can be calculated as:

P(Y = 1 | X = −1) = P(Y = 1, X = −1) / P(X = −1) = (1/3) / (1/3) = 1
P(Y = 0 | X = −1) = P(Y = 0, X = −1) / P(X = −1) = 0 / (1/3) = 0
P(Y = 1 | X = 0) = P(Y = 1, X = 0) / P(X = 0) = 0 / (1/3) = 0
P(Y = 0 | X = 0) = P(Y = 0, X = 0) / P(X = 0) = (1/3) / (1/3) = 1
P(Y = 1 | X = 1) = P(Y = 1, X = 1) / P(X = 1) = (1/3) / (1/3) = 1
P(Y = 0 | X = 1) = P(Y = 0, X = 1) / P(X = 1) = 0 / (1/3) = 0

(c) No. The marginal and conditional probabilities are not the same. These two random variables have zero covariance but are NOT independent.

(d) Take another look at Y and X. There is a nonlinear relationship between these variables: Y = X². Remember, covariance and correlation only measure linear relationships. Here X and Y are related, but not linearly. The point of this question is to drive home the message that: if X and Y are independent, the covariance is ALWAYS zero; but if the covariance is zero, X and Y are NOT ALWAYS independent.
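The zero-covariance-but-dependent point can be verified exactly (illustrative Python; exact fractions remove any rounding doubt, and the names are ours):

```python
from fractions import Fraction

third = Fraction(1, 3)
# Joint pmf of (X, Y) with Y = X**2: only three cells are non-zero
joint = {(-1, 1): third, (0, 0): third, (1, 1): third}

ex = sum(p * x for (x, y), p in joint.items())   # 0
ey = sum(p * y for (x, y), p in joint.items())   # 2/3
cov_xy = sum(p * (x - ex) * (y - ey) for (x, y), p in joint.items())
# cov_xy is exactly 0, even though Y is a deterministic function of X
```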
Question #10. Expected Value and Variance of Linear Combinations

First, recognize that R_f is not a random variable. This means that the asset always has a return of 0.02 no matter what.

(a) Here, we can apply our formulas for expectations of linearly related random variables.

E[P_1] = E[0.4 R_f + 0.6 R_3] = 0.4 R_f + 0.6 E[R_3] = 0.4(0.02) + 0.6(0.15) = 0.098

(b) Here, we can apply our formulas for the variance of linearly related random variables.

V[P_1] = V[0.4 R_f + 0.6 R_3] = (0.6)² V[R_3] = (0.36)(0.0225) = 0.0081

It is important to note that R_f drops out from the calculation because it is not a random variable.

(c) ρ_{P_1,R_3} = 1. The random variables P_1 and R_3 are linearly related. Therefore, their correlation is 1.

(d) ρ_{P_1,R_2} = 0. The correlation between R_3 and R_2 is zero because these random variables are independent. Taking linear combinations does not change the correlation (remember problem #3 part (g) of Homework #1) unless you multiply by a negative number, which changes the sign.
(e) Here, we can apply our formulas for expectations of linearly related random variables.

E[P_2] = E[0.2 R_f + 0.4 R_1 + 0.4 R_2] = 0.2 R_f + 0.4 E[R_1] + 0.4 E[R_2]
= 0.2(0.02) + 0.4(0.05) + 0.4(0.10)
= 0.004 + 0.02 + 0.04
= 0.064

(f) Here, we can apply our formulas for the variance of linearly related random variables.

V[P_2] = V[0.2 R_f + 0.4 R_1 + 0.4 R_2]
= (0.4)² V[R_1] + (0.4)² V[R_2] + 2(0.4)(0.4) corr(R_1, R_2) σ_{R_1} σ_{R_2}
= 0.16(0.05)² + 0.16(0.10)² + 2(0.4)(0.4)(0.5)(0.05)(0.10)
= 0.0004 + 0.0016 + 0.0008
= 0.0028

(g) P_2 is a linear combination of R_1 and R_2, which are both independent of R_3. So corr(P_2, R_3) = 0. We have not formally shown this, but it should make intuitive sense.
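Both portfolio calculations reduce to the same two formulas; a numerical sketch follows (illustrative Python; the weights and moments are those used above, and the variable names are ours):

```python
# Question 10 inputs
r_f = 0.02                  # risk-free return (a constant, not random)
e_r1, sd_r1 = 0.05, 0.05    # mean and s.d. of R1
e_r2, sd_r2 = 0.10, 0.10    # mean and s.d. of R2
e_r3, sd_r3 = 0.15, 0.15    # mean and s.d. of R3
corr_12 = 0.5               # corr(R1, R2)

# Portfolio P1 = 0.4 R_f + 0.6 R3
e_p1 = 0.4 * r_f + 0.6 * e_r3                       # 0.098
v_p1 = 0.6 ** 2 * sd_r3 ** 2                        # 0.0081; R_f adds no variance

# Portfolio P2 = 0.2 R_f + 0.4 R1 + 0.4 R2
e_p2 = 0.2 * r_f + 0.4 * e_r1 + 0.4 * e_r2          # 0.064
v_p2 = (0.4 ** 2 * sd_r1 ** 2 + 0.4 ** 2 * sd_r2 ** 2
        + 2 * 0.4 * 0.4 * corr_12 * sd_r1 * sd_r2)  # 0.0028
```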
Question #11. Expectation and Variance of Linear Combinations

(a) For each random variable i, we have E[Y_i] = E[W_i] = 1 and V[Y_i] = V[W_i] = 10.

E[Z] = E[Y_1 + Y_2 + Y_3] = E[Y_1] + E[Y_2] + E[Y_3] = 3

and

Var[Z] = Var[Y_1 + Y_2 + Y_3] = Var[Y_1] + Var[Y_2] + Var[Y_3] = 30

In the last part, the variance of the sum is equal to the sum of the variances because each Y_i is independent.

(b) We just use our formulas again:

E[V] = E[3W_1] = 3E[W_1] = 3

and

Var[V] = Var[3W_1] = 9 Var[W_1] = 90

Tripling your bet is more profitable on average but much riskier.

(c) This is simple if you realize that the mean of a sum is the sum of the means, and since the W's are independent, the variance of the sum is the sum of the variances. So since A = (1/2) T,
E[T] = 2, E[A] = 1, and Var[T] = 20, Var[A] = 5.

(d) E[T] = E[Σ_{i=1}^{n} W_i] = Σ_{i=1}^{n} E[W_i] = n

and

Var[T] = Var[Σ_{i=1}^{n} W_i] = Σ_{i=1}^{n} Var[W_i] = Σ_{i=1}^{n} 10 = 10n

and

E[A] = E[(1/n) Σ_{i=1}^{n} W_i] = (1/n) Σ_{i=1}^{n} E[W_i] = (1/n)(n) = 1

and

Var[A] = Var[(1/n) Σ_{i=1}^{n} W_i] = (1/n²) Σ_{i=1}^{n} Var[W_i] = (1/n²)(10n) = 10/n

(e) E[X̄] = E[(1/n) Σ_{i=1}^{n} X_i] = (1/n) Σ_{i=1}^{n} E[X_i] = (1/n)(nμ) = μ
and V ar[x] = V ar = = [ n V ar i= i= ] X i i= [ n X i ] n 2 σ2 = σ2 n