Expectations and Variance


4. Model parameters and their estimates
4.1 Expected Value and Conditional Expected Value
4.2 The Variance
4.3 Population vs Sample Quantities
4.4 Mean and Variance of a Linear Combination
4.5 The Covariance and Correlation for RVs
4.6 Mean and Variance of a Linear Combination (again)
4.7 The Central Limit Theorem

Expectations and Variance

What is a typical outcome for a model? What should we expect to happen? How spread out are the outcomes? In this section we introduce the mean and variance of random variables.

Definition: Given a discrete random variable X with values x_i, the expected value (mean) of X is

E(X) = sum over all x_i of p(x_i) * x_i

The expected value is simply a weighted average of the possible values X can assume, where the weights are the long-run frequencies of occurrence, i.e. the probabilities. This is similar to, but fundamentally different from, the sample average that we talked about before. Here the x_i are possible values for the random variable and p(x_i) is a probability!

Let R be a random variable denoting the return on an asset, with distribution

r:     0.05   0.10   0.15
p(r):  0.1    0.5    0.4

E(R) = .1*(.05) + .5*(.1) + .4*(.15) = .115

More likely values get more weight.
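The weighted-average computation above can be written as a short sketch (Python is used here for illustration; it is not part of the original notes):

```python
# Expected value of the asset-return example: the possible returns and
# their probabilities come from the table above.
returns = [0.05, 0.10, 0.15]
probs = [0.10, 0.50, 0.40]

# E(R) = sum of p(x) * x over all possible values x
expected_return = sum(p * r for p, r in zip(probs, returns))
print(expected_return)  # 0.115 (up to floating-point rounding)
```

Swapping in any other discrete distribution only requires changing the two lists.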

Hence the expected value summarizes a typical outcome. Clearly it is analogous to the sample mean, telling us about the center or typical value associated with a data set. The link between the sample mean and the expected value of a random variable can be made explicit.

4.1 Relationship between expected values and sample averages: a first look.

Suppose we toss two coins ten times. Each time we record the number of heads, so each observation is 0, 1, or 2. Say we observe four 0's, three 1's, and three 2's. What is the average value?

Mean of x = (4/10)*0 + (3/10)*1 + (3/10)*2 = 0.9

Now suppose we toss them 1000 times and record the number of heads each time, getting a long list of 0's, 1's, and 2's. [The page-long listing of the 1000 recorded values is omitted here.]

What is the mean? Let n0, n1, n2 be the number of 0's, 1's, and 2's respectively. Then the average would be

(n0*0 + n1*1 + n2*2) / n

which is the same as

x-bar = (n0/n)*0 + (n1/n)*1 + (n2/n)*2

The values are actually iid draws from the distribution:

x:      0      1      2
Pr(x):  0.25   0.50   0.25

so, for n large, we should have

n0/n ≈ .25,  n1/n ≈ .5,  n2/n ≈ .25

so the average should be about: .25(0) + .5(1) + .25(2) = 1
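The long-run behavior described above can be checked with a quick simulation (a sketch, not from the original notes; the sample size 100,000 is arbitrary):

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

# Simulate tossing two fair coins many times, recording the number of heads
n = 100_000
draws = [random.randint(0, 1) + random.randint(0, 1) for _ in range(n)]

sample_mean = sum(draws) / n
print(sample_mean)  # close to the expected value .25*0 + .5*1 + .25*2 = 1
```

With more draws, the sample mean settles ever closer to the expected value 1.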

The actual mean is:

Mean of x = 1.011

With a very, very, very, ... large number of tosses we would expect the sample mean (the mean of the recorded numbers) to be very close to 1. We can think of

p(0)*0 + p(1)*1 + p(2)*2 = 1

as the long-run average of iid draws.

Roulette Example: A roulette wheel has the numbers 1 through 36 on it, as well as 0 and 00. Say you bet $1 that an odd number comes up. Let X = 1 denote the event that you win (with probability 18/38) and X = -1 denote the event that you lose (with probability 20/38). The expected value of X is

E(X) = 1*(18/38) + (-1)*(20/38) = -2/38 = -1/19

Thus, in the long run, your expected loss (and the casino's expected gain) is about $.05 per bet.
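The roulette expectation can be computed exactly with rational arithmetic (an illustrative sketch, not part of the original notes):

```python
from fractions import Fraction

# Bet $1 on odd: win $1 with probability 18/38, lose $1 with probability 20/38
p_win = Fraction(18, 38)
p_lose = Fraction(20, 38)

expected = 1 * p_win + (-1) * p_lose
print(expected)         # -1/19
print(float(expected))  # about -0.0526, i.e. a loss of roughly 5 cents per bet
```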

Suppose the returns associated with an asset are iid N(.01, .04). This tells us a lot. It tells us what the uncertainty is about the next return, and it tells us, on average, what the return will be in the long run!

The Expected Value of a Continuous Random Variable

The expected value of a continuous random variable X is:

E(X) = integral of x f(x) dx

The basic intuition for what it means is the same for continuous as for discrete random variables: it is basically a weighted average of all the possible outcomes. We still denote it by μ_X or E(X).

Conditional expectations

Previously, we discussed unconditional and conditional distributions. Consider the following model for the joint distribution of stock and bond returns (rows: Bonds, columns: Stocks):

                    Stocks
Bonds      -0.05   0.05   0.10  |
-0.02       0.20   0.05   0.05  | 0.30
 0.01       0.05   0.20   0.10  | 0.35
 0.03       0.05   0.10   0.20  | 0.35
            0.30   0.35   0.35

Expected value of stock return

The expected value for stocks is obtained by taking a weighted average of possible outcomes where the weights are the unconditional (marginal) probabilities. It answers the question: what should we expect to happen to stocks regardless of what happened to bonds?

Stocks   Pr(Stocks)
-0.05    0.30
 0.05    0.35
 0.10    0.35

Unconditional expectation: E(Stocks) = -.05(.30) + .05(.35) + .10(.35) = 0.0375

Conditional expected value

We could ask a different question: in months when the bond market (say) goes up by .03, what is the expected value for stocks? Answers to this question should use the conditional distribution for stocks in months when bonds went up by .03. We differentiate between conditional and unconditional expectations by using the vertical bar: E(Y | X = x).

Stocks   Pr(Stocks | Bonds = .03)
-0.05    0.14
 0.05    0.29
 0.10    0.57

Conditional (on Bonds = .03) expectation: E(Stocks | Bonds = .03) = -.05(.14) + .05(.29) + .10(.57) ≈ 0.064

4.2 Variance of a Random Variable

The variance and standard deviation are measures of the dispersion of a random variable around its mean. Consider two random variables whose probability histograms have the same mean but different spreads: the means are the same, so what is the difference between these random variables? [Side-by-side probability histograms omitted; one distribution is much more spread out around the common mean than the other.]
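The conditional expectation is the same weighted average, just with conditional weights. A minimal sketch, assuming stock outcomes of -.05, .05, and .10 with the conditional probabilities .14, .29, .57 shown above:

```python
# Conditional distribution of Stocks given Bonds = .03 (values from the example)
stock_values = [-0.05, 0.05, 0.10]
cond_probs = [0.14, 0.29, 0.57]

# E(Stocks | Bonds = .03): weight each outcome by its conditional probability
cond_expectation = sum(p * v for p, v in zip(cond_probs, stock_values))
print(round(cond_expectation, 4))  # about 0.064
```

Only the probability column changes between the unconditional and conditional computations; the outcome column is the same.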

Technically, the variance of a random variable is the expected value of the squared deviation of the random variable from its mean. The general mathematical formula is

Var(X) = E[(X - μ)²] = E(X²) - μ²

For discrete random variables, this simplifies to

Var(X) = E[(X - μ)²] = sum over all x_i of (x_i - μ)² P(x_i)

a weighted average of squared deviations from the mean.

Coin Example: We toss a coin two times and let X be the number of heads.

E(X) = 0*P(X=0) + 1*P(X=1) + 2*P(X=2) = 0(.25) + 1(.5) + 2(.25) = 1

x    (x-μ)²    P(x)   (x-μ)²P(x)
0    (0-1)²    .25    .25
1    (1-1)²    .50    0
2    (2-1)²    .25    .25
                      Var(X) = 0.5

Example: Suppose that the probability distribution function for the number of errors, X, on pages from business textbooks is

x:       0     1     2
P(X=x):  .81   .17   .02

Find E(X) and Var(X).

E(X) = 0(.81) + 1(.17) + 2(.02) = 0.21
E(X²) = 0(.81) + 1(.17) + 4(.02) = 0.25
Var(X) = E(X²) - μ² = 0.25 - 0.0441 = 0.2059
std(X) = sqrt(.2059) ≈ 0.4538

Example: The mean and variance of a Bernoulli r.v. Suppose X ~ Bernoulli(p).

x    p(x)
1    p
0    1-p

E(X) = p*1 + (1-p)*0 = p

Var(X) = p*(1-p)² + (1-p)*(0-p)² = p(1-p)*[(1-p) + p] = p(1-p)

For what value of p is the mean smallest? Biggest? For what value of p is the variance smallest? Biggest?
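The Bernoulli mean and variance derived above can be verified directly from the definition (an illustrative sketch; the function name is mine, not from the notes):

```python
def bernoulli_mean_var(p):
    # E(X) = 1*p + 0*(1-p) = p
    mean = 1 * p + 0 * (1 - p)
    # Var(X) = E[(X - mean)^2], computed term by term over the two outcomes
    var = p * (1 - mean) ** 2 + (1 - p) * (0 - mean) ** 2
    return mean, var

m, v = bernoulli_mean_var(0.5)
print(m, v)  # variance p(1-p) is largest at p = 0.5
```

Trying p near 0 or 1 shows the variance shrinking toward zero, while the mean just tracks p.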

4.3 Population vs Sample Quantities

Sample mean vs. expected value (or mean of a r.v.):

dataset: an observed set of values
random variable: a model for an uncertain quantity

sample mean of a variable in our data:

x-bar = (1/n) * sum of x_i, i = 1, ..., n

The average of observed values in our dataset.

expected value (or mean) of a r.v.:

E(X) = sum over all x of p(x)*x

Average of possible values the r.v. can take, weighted by probabilities.

sample    population
x-bar     μ or E(X)
s²        σ² or Var(X)

We will estimate the population quantity with the sample quantity. How exactly does the sample size impact the precision of the estimate? What else affects the precision?

4.4 Expectation and Variance of functions

Example: A contractor estimates the probabilities for the time T (in number of days) required to complete a certain type of job as follows:

t     1     2     3     4     5
p(t)  .05   .20   .35   .30   .10

Note that:

E(T) = .05*1 + .20*2 + .35*3 + .30*4 + .10*5 = 3.2
Var(T) = .05*(1-3.2)² + .20*(2-3.2)² + .35*(3-3.2)² + .30*(4-3.2)² + .10*(5-3.2)² = 1.06

Suppose that the project costs $20,000 plus $3,000 per day to complete. How should you price this project? The longer it takes to complete the job, the greater the cost. There is a fixed cost of $20,000 and an additional (variable) cost of $3,000 per day. Let C denote the cost in dollars. Then

C = 20,000 + 3,000*T

Before the project is started, both T and C are unknown and hence random variables. C depends on T, and since T is unknown, so is C. Since the values of C are determined by the values of T, the probability that C takes any given value is determined by the probability that T takes the value that generates C.

C = 20,000 + 3,000*T

We could write out the probability distribution of C and solve for its mean and variance:

T:  t    p(t)        C:  c        p(c)
    1    .05             23,000   .05
    2    .20             26,000   .20
    3    .35             29,000   .35
    4    .30             32,000   .30
    5    .10             35,000   .10

E(C) = .05*23,000 + .20*26,000 + .35*29,000 + .30*32,000 + .10*35,000 = 29,600

Var(C) = .05*(23,000-29,600)² + .20*(26,000-29,600)² + .35*(29,000-29,600)² + .30*(32,000-29,600)² + .10*(35,000-29,600)² = 9,540,000

It would be nice to be able to figure out the mean and variance of C without having to go to all this trouble.

Let X be a random variable with mean μ_X and variance σ_X². Let c0 and c1 be any fixed constants, and define the random variable W = c0 + c1*X. Then

μ_W = E(W) = c0 + c1*E(X) = c0 + c1*μ_X
σ_W² = Var(W) = Var(c0 + c1*X) = c1²*σ_X²

Back to our cost and time example. If we already know E(T) and Var(T), it is much easier to use the formulas!

C = 20,000 + 3,000*T
E(C) = 20,000 + 3,000*E(T) = 20,000 + 3,000(3.2) = 29,600
Var(C) = (3,000²)*Var(T) = 9,000,000*(1.06) = 9,540,000
σ_C = 3,000*sqrt(1.06) ≈ 3,089

Expected Value of a non-linear function of a random variable

Sometimes we will be interested in the expected value of some function of a random variable. For example, let W be the prize a game show contestant ends up with.

Example: Deal or No Deal. George has cases worth $5, $400, $10,000, and $1,000,000 remaining (4 outcomes, equally likely). The banker's offer is $189,000.

E(W) = .25*5 + .25*400 + .25*10,000 + .25*1,000,000 = $252,601.25

Compared to the expected value, this deal may not look very attractive...
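The shortcut formulas for a linear transformation can be checked against the direct computation. This sketch (not from the original notes) uses the contractor's time distribution:

```python
import math

# Time-to-completion distribution from the contractor example
t_vals = [1, 2, 3, 4, 5]
t_probs = [0.05, 0.20, 0.35, 0.30, 0.10]

mean_t = sum(p * t for p, t in zip(t_probs, t_vals))
var_t = sum(p * (t - mean_t) ** 2 for p, t in zip(t_probs, t_vals))

# Linear transformation C = c0 + c1*T: the mean shifts and scales,
# the variance scales by c1^2 (the constant c0 drops out)
c0, c1 = 20_000, 3_000
mean_c = c0 + c1 * mean_t
var_c = c1 ** 2 * var_t
sd_c = c1 * math.sqrt(var_t)
print(mean_t, var_t)        # 3.2 and 1.06
print(mean_c, var_c, sd_c)  # 29,600 and 9,540,000 and about 3,089
```

Computing E(C) and Var(C) from the full distribution of C gives the same answers, which is the point of the shortcut.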

BUT, this assumes people choose based on expected values. Economists believe in diminishing marginal utility of income: the more wealth you have, the less utility you get from making an extra $1. This implies people are risk averse. This is often modeled with a utility of wealth like U(W) = sqrt(W). [Plot of the concave utility curve U(W) = sqrt(W) against wealth omitted.]

To compute the expected value E[f(X)], just take f of each possible outcome, then multiply by the probability and add.

What is George's expected utility?

E[sqrt(W)] = .25*sqrt(5) + .25*sqrt(400) + .25*sqrt(10,000) + .25*sqrt(1,000,000) ≈ 280.56

Utility of taking the offer: sqrt(189,000) ≈ 434.74!!

Aside: Let's take this a step further. Could wealth affect your attitude toward risk? If you're Donald Trump (he has actually appeared several times on the show, but not as a contestant!), might you make a different decision? Say Donald is worth $50 million. What would Donald's expected utility be?

E[sqrt(50Mil + W)] = .25*sqrt(50Mil + 5) + .25*sqrt(50Mil + 400) + .25*sqrt(50Mil + 10,000) + .25*sqrt(51Mil) ≈ 7089

Utility of taking the offer: sqrt(50,189,000) ≈ 7084
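The expected-value versus expected-utility comparison can be reproduced in a few lines (a sketch using the case amounts from the example; not part of the original notes):

```python
import math

amounts = [5, 400, 10_000, 1_000_000]  # four equally likely cases
offer = 189_000

# Expected prize: plain weighted average of the outcomes
expected_value = sum(a / 4 for a in amounts)

# Expected utility under U(W) = sqrt(W): apply f to each outcome FIRST,
# then average -- E[f(W)] is not f(E[W]) for a non-linear f
expected_utility = sum(math.sqrt(a) / 4 for a in amounts)
offer_utility = math.sqrt(offer)

print(expected_value)                                   # 252601.25
print(round(expected_utility, 2), round(offer_utility, 2))  # 280.56 vs 434.74
```

The offer beats the gamble in utility terms even though it is far below the expected prize, which is exactly the risk-aversion story in the text.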

4.5 Covariance and Correlation for RVs

Suppose we have the bivariate distribution of a pair of random variables (X, Y). We might ask: are X and Y related? We define the covariance and correlation between the RVs to summarize their linear relationship.

The covariance for a bivariate discrete distribution is given by:

cov(X,Y) = σ_XY = sum over all (x,y) of p(x,y)(x - μ_X)(y - μ_Y)

This is just the average product of the deviation of X from its mean and the deviation of Y from its mean, weighted by the joint probabilities p(x,y).

The correlation between random variables (discrete or continuous) is

ρ_XY = σ_XY / (σ_X * σ_Y)

Like the sample correlation, the correlation between two random variables equals their covariance divided by the product of their standard deviations. This creates a unit-free measure of the relationship between X and Y.

ρ: the basic facts

-1 ≤ ρ ≤ 1

If ρ is close to 1, that means there is a line, with positive slope, such that pairs (x, y) are likely to fall close to it. If ρ is close to -1, same thing, but the line has a negative slope.

Example

μ_X = .1, μ_Y = .1, σ_X = .05, σ_Y = .05

Joint distribution of (X, Y):

           Y = .05   Y = .15
X = .05      .4        .1
X = .15      .1        .4

cov(X,Y) = E[(X - μ_X)(Y - μ_Y)]

In EXCEL, use the formula:
= .4*(-.05)*(-.05) + .1*(-.05)*(.05) + .1*(.05)*(-.05) + .4*(.05)*(.05)
EXCEL gives us: 0.0015

The correlation is:
In EXCEL, use the formula: = .0015/(.05*.05)
EXCEL gives us: 0.60

What line is the pair of numbers likely to be close to? Consider the scatter plot of the outcomes where the dot sizes are proportional to the probability; what is the relationship? [Scatter plot of the four (x, y) outcomes omitted; the high-probability dots lie along an upward-sloping line.]
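The same covariance and correlation can be computed from the joint table without Excel (an illustrative sketch, not part of the original notes):

```python
import math

# Joint distribution of (X, Y): {(x, y): probability}, from the table above
joint = {(0.05, 0.05): 0.4, (0.05, 0.15): 0.1,
         (0.15, 0.05): 0.1, (0.15, 0.15): 0.4}

mean_x = sum(p * x for (x, y), p in joint.items())
mean_y = sum(p * y for (x, y), p in joint.items())

# Covariance: probability-weighted product of deviations from the means
cov = sum(p * (x - mean_x) * (y - mean_y) for (x, y), p in joint.items())

sd_x = math.sqrt(sum(p * (x - mean_x) ** 2 for (x, y), p in joint.items()))
sd_y = math.sqrt(sum(p * (y - mean_y) ** 2 for (x, y), p in joint.items()))
corr = cov / (sd_x * sd_y)

print(round(cov, 4), round(corr, 2))  # 0.0015 and 0.6
```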

Example: Let's compute the covariance for the joint distribution

         Y = 0    Y = 1
X = 0     .25      .25
X = 1     .25      .25

E[(X - μ_X)(Y - μ_Y)] = .25(-.5)(-.5) + .25(-.5)(.5) + .25(.5)(-.5) + .25(.5)(.5) = 0

The covariance is 0, and so is the correlation.

Independence and Correlation

Suppose two rv's are independent. That means they have nothing to do with each other. In particular, they have nothing to do with each other linearly. That means the correlation is 0. If X and Y are independent, then

ρ_XY = 0 and cov(X,Y) = 0

4.6 Mean and Variance of a Linear Combination

Suppose Y = c0 + c1*X1 + c2*X2 + c3*X3 + ... + ck*Xk. Then

μ_Y = c0 + c1*μ1 + c2*μ2 + c3*μ3 + ... + ck*μk

σ_Y² = c1²σ1² + c2²σ2² + ... + ck²σk² + (all the possible combinations of covariance terms, 2*ci*cj*σij for i < j)

With just two rv's we have:

Y = c0 + c1*X1 + c2*X2
μ_Y = c0 + c1*μ1 + c2*μ2
σ_Y² = c1²σ1² + c2²σ2² + 2*c1*c2*σ12

Example: Suppose that the assets of a company consist of two retail stores and cash. The cash is held in the form of T-bills earning a fixed return of 2.5 million dollars per year. Let X1 denote the profits from the first retail store over the next year and let X2 denote the profits from the second retail store over the next year. What is the company's profit over the next year in terms of X1 and X2?

The mean profit is 12 million per year for retail store 1 and 17 million per year for retail store 2. The standard deviation of annual profits is 6 million for store 1 and 8 million for store 2. The covariance between store 1 and store 2 profits is 20 (corr = 20/(6*8) = .4167).

What are the mean and standard deviation of profits for the company?

Example: Suppose P = .5X + .5Y with

μ_X = .05, μ_Y = .1, σ_X² = .01, σ_Y² = .01, σ_XY = 0

We know nothing else about X and Y. They might be discrete or continuous.

E(P) = .5*.05 + .5*.1 = 0.075
Var(P) = .5²*.01 + .5²*.01 + 0 = .005

The variance of the portfolio is half that of the inputs.

Example: Suppose P = .5X + .5Y with

μ_X = .05, μ_Y = .1, σ_X² = .01, σ_Y² = .01, ρ_XY = -.9

E(P) = .5*.05 + .5*.1 = 0.075
Var(P) = .5²*.01 + .5²*.01 - (2*.5*.5)*(.1*.1*.9) = .0005

This time, when one is up the other tends to be down, so there is substantial variance reduction.

Example: Suppose P = .5X + .5Y with

μ_X = .05, μ_Y = .1, σ_X² = .01, σ_Y² = .01, ρ_XY = .9

We know nothing else about X and Y. They might be discrete or continuous or whatever.

E(P) = .5*.05 + .5*.1 = 0.075
Var(P) = .5²*.01 + .5²*.01 + (2*.5*.5)*(.1*.1*.9) = 0.0095

This time there is little variance reduction, because the large positive correlation tells us that X and Y tend to move up and down together.
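The three portfolio cases differ only in the correlation, so one small helper covers them all (a sketch, not part of the original notes; the function name is mine):

```python
def portfolio_var(w1, w2, var1, var2, corr):
    # Var(w1*X + w2*Y) = w1^2 Var(X) + w2^2 Var(Y) + 2 w1 w2 Cov(X, Y),
    # where Cov(X, Y) = corr * sd(X) * sd(Y)
    cov = corr * (var1 ** 0.5) * (var2 ** 0.5)
    return w1 ** 2 * var1 + w2 ** 2 * var2 + 2 * w1 * w2 * cov

print(portfolio_var(0.5, 0.5, 0.01, 0.01, 0.0))   # 0.005  (uncorrelated)
print(portfolio_var(0.5, 0.5, 0.01, 0.01, -0.9))  # 0.0005 (strong negative corr.)
print(portfolio_var(0.5, 0.5, 0.01, 0.01, 0.9))   # 0.0095 (strong positive corr.)
```

Sweeping corr from -1 to 1 traces out exactly the diversification story in the text: the more negative the correlation, the smaller the portfolio variance.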

Example: Y = c1*X1 + c2*X2 + c3*X3 with weights c1 = .2, c2 = .5, c3 = .3 and

μ1 = .05, μ2 = .1, μ3 = .15
σ1² = .01, σ2² = .009, σ3² = .008
ρ12 = .3, ρ13 = .2, ρ23 = .2
σ12 = .002846, σ13 = .001789, σ23 = .001697

μ_Y = .2*(.05) + .5*(.1) + .3*(.15) = 0.105

σ_Y² = .2²*(.01) + .5²*(.009) + .3²*(.008) + 2*.2*.5*(.002846) + 2*.2*.3*(.001789) + 2*.5*.3*(.001697) ≈ 0.0047

Special Case: Suppose the n X's are all uncorrelated and Y = X1 + X2 + ... + Xn. Then

E(Y) = E(X1) + E(X2) + ... + E(Xn)
Var(Y) = Var(X1) + Var(X2) + ... + Var(Xn)

Example: What is the variance of a Binomial random variable? If Y ~ B(n, p), then Y = X1 + X2 + ... + Xn, where X1, ..., Xn are iid Bernoulli(p).

E(Y) = E(X1 + ... + Xn) = E(X1) + E(X2) + ... + E(Xn) = np
Var(Y) = Var(X1 + ... + Xn) = Var(X1) + Var(X2) + ... + Var(Xn) = np(1-p)

For Y ~ B(n, p), E(Y) = np and Var(Y) = np(1-p).
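The formulas E(Y) = np and Var(Y) = np(1-p) can be verified against the exact binomial pmf (an illustrative sketch; the function name and the choice n = 10, p = 0.2 are mine):

```python
from math import comb

def binomial_mean_var(n, p):
    # Exact Binomial(n, p) pmf: P(Y = k) = C(n, k) p^k (1-p)^(n-k)
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum(k * q for k, q in enumerate(pmf))
    var = sum((k - mean) ** 2 * q for k, q in enumerate(pmf))
    return mean, var

m, v = binomial_mean_var(10, 0.2)
print(m, v)  # matches np = 2 and np(1-p) = 1.6
```

The brute-force sum over the pmf and the sum-of-Bernoullis shortcut agree, which is the content of the derivation above.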

4.7 The Central Limit Theorem

Often, the normal distribution is a good approximation to distributions of quantities we observe in the real world. One reason for this is the central limit theorem. It says that if a RV is a combination of a bunch of independent RVs, then its distribution is approximately normal. If we know it is normal, then all we need is the mean and variance!

Let X1, X2, ..., Xn be n iid random variables and let Y = X1 + X2 + ... + Xn. The CLT says that Y will be approximately normally distributed for large n.

Let the random variables Xi denote n iid Bernoulli random variables with p = .2, i.e. each takes the value 1 with probability .2. First take the case where n = 2, so we have just X1 and X2, and let Y = X1 + X2 (Y is a linear combination of X1 and X2).

[Probability histograms of the Binomial(n, .2) distribution for n = 5, n = 50, and n = 100 omitted: as n grows, the histogram looks more and more like a normal bell curve.]

The n = 100 histogram, with the normal curve with μ = np and σ² = np(1-p) drawn over it, shows that the normal curve fits the probabilities well. [Overlay plot omitted.]

The fact that the binomial distribution can be approximated by a normal distribution is a result of the central limit theorem. IMPORTANT: this example was for sums of iid Bernoulli random variables. The result actually holds for linear combinations of any iid random variables.
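The quality of the normal overlay can be checked numerically by comparing the exact Binomial(100, .2) probabilities with the matching normal density (a sketch, not part of the original notes):

```python
import math
from math import comb

# Binomial(n, p) vs the normal curve with the same mean np and variance np(1-p)
n, p = 100, 0.2
mu = n * p
sigma = math.sqrt(n * p * (1 - p))

def normal_pdf(x):
    # Density of N(mu, sigma^2)
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

# Largest gap between the exact pmf and the normal density over all k
worst = max(abs(comb(n, k) * p**k * (1 - p) ** (n - k) - normal_pdf(k))
            for k in range(n + 1))
print(worst)  # small: the normal curve tracks the binomial probabilities closely
```

Rerunning with a smaller n (say n = 5) gives a visibly larger gap, which matches the sequence of histograms described above.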

Example: Peanut Butter and the CLT

A grocery store clerk is responsible for ordering peanut butter. Shipments come once every 50 days. The clerk knows that the average daily number of jars of peanut butter sold is 25 and the variance is 98, and that the number of jars sold on any day is unrelated to the number sold on any other day.

Let's denote the number of jars sold each day for the next 50 days as X1, X2, ..., X50. The Xi are iid with mean 25 and variance 98. Let Y denote the total number of jars sold over a 50-day period. What is Y in terms of the Xi?

Y = X1 + X2 + ... + X50

What are the mean and variance of Y?

E(Y) = 50*25 = 1250
Var(Y) = 50*98 = 4900
SD(Y) = sqrt(4900) = 70
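The total's mean and variance follow from the uncorrelated-sum rules. A minimal sketch, using a daily mean of 25 jars and variance of 98 as in the example:

```python
import math

# 50 iid days; daily mean and variance from the peanut butter example
n_days = 50
daily_mean, daily_var = 25, 98

total_mean = n_days * daily_mean   # means add across independent days
total_var = n_days * daily_var     # variances add across independent days
total_sd = math.sqrt(total_var)
print(total_mean, total_var, total_sd)  # 1250 4900 70.0
```

Note that the standard deviation of the total grows like sqrt(n), not n: 50 days multiplies the sd by sqrt(50), not 50.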

What is the distribution of Y? If the clerk orders 1390 jars to cover the next 50 days, what is the probability of a stockout?

There is more to come on the CLT. Stay tuned.