Lecture 7: Chapter 7. Sums of Random Variables and Long-Term Averages


ELEC206 Probability and Random Processes, Fall 2014
Gil-Jin Jang (gjang@knu.ac.kr), School of EE, KNU

Overview

7.1 Sums of Random Variables (7.1.1)
7.2 The Sample Mean and the Laws of Large Numbers
    Review of 4.6 The Markov and Chebyshev Inequalities
7.3 The Central Limit Theorem (7.3.1)

Many problems involve counting the number of event occurrences, measuring cumulative effects, or computing arithmetic averages in a series of measurements. These problems can be reduced to the problem of finding, exactly or approximately, the distribution of a random variable that consists of the sum of n independent, identically distributed random variables. We investigate sums of random variables and their properties as n becomes large.

7.1 Sums of Random Variables

Let S_n = X_1 + X_2 + \cdots + X_n.

7.1.1 Mean and Variance of Sums of Random Variables

Regardless of statistical dependence, the expected value of a sum of n random variables is equal to the sum of the expected values (proof: see Example 5.24):

    E[S_n] = E[X_1 + X_2 + \cdots + X_n] = E[X_1] + E[X_2] + \cdots + E[X_n]

Example 7.1 Find the variance of Z = X + Y.

    VAR[Z] = E[(Z - m_Z)^2]
           = E[{(X + Y) - (m_X + m_Y)}^2]
           = E[{(X - m_X) + (Y - m_Y)}^2]
           = VAR[X] + VAR[Y] + 2 COV(X, Y)
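A quick numerical check of Example 7.1 (a sketch for illustration, not part of the original slides; the correlated pair below is our own construction):

```python
# Empirically check VAR[Z] = VAR[X] + VAR[Y] + 2 COV(X, Y) for dependent X, Y.
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
x = rng.normal(0.0, 1.0, n)
y = 0.5 * x + rng.normal(0.0, 1.0, n)   # y correlated with x, COV(X, Y) = 0.5
z = x + y

lhs = z.var()
rhs = x.var() + y.var() + 2 * np.cov(x, y)[0, 1]
print(lhs, rhs)   # both close to 1 + 1.25 + 2(0.5) = 3.25
```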

Generalisation to many RVs

The variance of the sum of n RVs is NOT, in general, equal to the sum of the individual variances:

    VAR[S_n] = E[ \sum_{j=1}^{n} (X_j - E[X_j]) \sum_{k=1}^{n} (X_k - E[X_k]) ]
             = \sum_{j=1}^{n} \sum_{k=1}^{n} E[(X_j - E[X_j])(X_k - E[X_k])]
             = \sum_{j=1}^{n} \sum_{k=1}^{n} COV(X_j, X_k)
             = \sum_{k=1}^{n} VAR[X_k] + \sum_{j=1}^{n} \sum_{k=1, k \ne j}^{n} COV(X_j, X_k)

The variance of the sum of n INDEPENDENT RVs is equal to the sum of the variances:

    VAR[S_n] = VAR[X_1] + \cdots + VAR[X_n]

Example 7.2 Sum of iid RVs. If X_1, X_2, ..., X_n are iid (independent, identically distributed) with mean µ and variance σ^2, then

    E[S_n] = E[X_1] + \cdots + E[X_n] = nµ
    VAR[S_n] = n VAR[X_j] = nσ^2
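The double sum says VAR[S_n] is the sum of all entries of the covariance matrix of (X_1, ..., X_n). A simulation sketch of this identity, with an arbitrarily chosen covariance matrix (our own illustration, not from the slides):

```python
# VAR[S_n] equals the sum of ALL entries of the covariance matrix; it reduces
# to the sum of variances only when the off-diagonal covariances vanish.
import numpy as np

rng = np.random.default_rng(1)
samples = rng.multivariate_normal(
    mean=[0, 0, 0],
    cov=[[1.0, 0.3, 0.0],
         [0.3, 2.0, -0.5],
         [0.0, -0.5, 1.5]],
    size=500_000)

s = samples.sum(axis=1)                # S_3 = X_1 + X_2 + X_3
sigma = np.cov(samples, rowvar=False)  # estimated covariance matrix
print(s.var(), sigma.sum())            # both near 1 + 2 + 1.5 + 2(0.3 - 0.5) = 4.1
```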

7.2 The Sample Mean and the Laws of Large Numbers

Let X be a RV whose mean E[X] = µ is unknown. Let X_1, ..., X_n denote n independent, repeated measurements of X; i.e., the X_j are iid RVs with the same pdf as X. The sample mean is a RV used to estimate the true mean E[X]:

    M_n = (1/n) \sum_{j=1}^{n} X_j

We will assess the effectiveness of M_n as an estimator for E[X] by computing the expected value and variance of M_n, and by investigating the behaviour of M_n as n becomes large. This is very important in real situations: how many measurements are necessary to obtain a reliable mean? If we make n measurements and take their average as the mean value, can we be confident in our mean?

Sample Mean

The sample mean is itself a RV, so it will exhibit random variation. The properties (conditions) of a good estimator:

1. On the average, it should give the correct value: E[M_n] = µ.
2. It should not vary too much: VAR[M_n] = E[(M_n - µ)^2] is small enough.

The sample mean is an unbiased estimator for µ:

    E[M_n] = E[(1/n) \sum_{j=1}^{n} X_j] = (1/n) \sum_{j=1}^{n} E[X_j] = (1/n) nµ = µ

since E[X_j] = E[X] = µ for all j. The estimator is unbiased, i.e., centered at the true value.

Variance of the Sample Mean

Compute the variance of M_n about its mean µ:

    VAR[M_n] = E[(M_n - µ)^2] = E[(M_n - E[M_n])^2]

Since M_n = (1/n) \sum_{j=1}^{n} X_j = (1/n) S_n with the X_j iid, Example 7.2 gives VAR[S_n] = n VAR[X_j] = nσ^2, and hence

    VAR[M_n] = (1/n^2) VAR[S_n] = nσ^2 / n^2 = σ^2 / n

The variance of the sample mean approaches zero as the number of samples is increased. In other words, the probability that the sample mean is close to the true mean approaches one as n becomes very large.
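A brief simulation sketch of the σ^2/n decay (our own illustration; the exponential distribution with σ^2 = 1 is an arbitrary choice):

```python
# The spread of M_n across many repeated experiments shrinks like sigma^2 / n
# (here sigma^2 = 1 for exponential(1) measurements).
import numpy as np

rng = np.random.default_rng(2)
reps = 20_000
for n in (10, 100, 1000):
    m_n = rng.exponential(1.0, size=(reps, n)).mean(axis=1)  # 20,000 sample means
    print(n, m_n.var(), 1.0 / n)  # empirical VAR[M_n] vs sigma^2 / n
```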

4.6 Markov and Chebyshev Inequalities

Markov inequality: The mean and variance of a random variable X help to obtain bounds for probabilities of the form P[X ≥ t]. For X nonnegative and a > 0,

    P[X ≥ a] ≤ E[X] / a

proof:

    E[X] = \int_0^a t f_X(t) dt + \int_a^\infty t f_X(t) dt
         ≥ \int_a^\infty t f_X(t) dt
         ≥ \int_a^\infty a f_X(t) dt = a P[X ≥ a]

Chebyshev inequality: Suppose that the mean is E[X] = m and the variance is VAR[X] = σ^2. Then

    P[|X - m| ≥ a] ≤ σ^2 / a^2

proof: Let D^2 = (X - m)^2 be the squared deviation from the mean. Apply the Markov inequality to D^2:

    P[D^2 ≥ a^2] ≤ E[(X - m)^2] / a^2 = σ^2 / a^2

In the case σ^2 = 0, the inequality implies that P[X = m] = 1.
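A sketch comparing the Chebyshev bound with an actual tail probability (our own illustration; the standard normal is an arbitrary test case, and the bound is typically loose):

```python
# Chebyshev: P[|X - m| >= a] <= sigma^2 / a^2 for ANY distribution;
# compare with the empirical tail of a standard normal (m = 0, sigma^2 = 1).
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(0.0, 1.0, 1_000_000)
for a in (1.0, 2.0, 3.0):
    empirical = np.mean(np.abs(x) >= a)
    print(a, empirical, 1.0 / a**2)   # empirical tail <= Chebyshev bound
```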

Applying the Chebyshev Inequality

Keeping in mind that E[M_n] = µ and VAR[M_n] = σ^2 / n, the Chebyshev inequality can be formulated as

    P[|M_n - E[M_n]| ≥ ε] ≤ VAR[M_n] / ε^2
    P[|M_n - µ| ≥ ε] ≤ σ^2 / (nε^2)
    P[|M_n - µ| < ε] ≥ 1 - σ^2 / (nε^2)

If the true variance σ^2 is known, we can select the number of samples n so that M_n is within ε of the true mean with probability 1 - δ or greater; it suffices to take n ≥ σ^2 / (δε^2).
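A minimal helper implementing this sample-size rule (the function name is ours, not from the text):

```python
import math

def chebyshev_sample_size(sigma2, eps, delta):
    """Smallest n with P[|M_n - mu| < eps] >= 1 - delta via Chebyshev."""
    return math.ceil(sigma2 / (delta * eps**2))

print(chebyshev_sample_size(sigma2=1.0, eps=1.0, delta=0.01))  # Example 7.9 -> 100
```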

Example 7.9 A voltage of (unknown) constant value is to be measured. Each measurement X_j is the sum of the desired voltage v (the true mean) and a noise voltage N_j of zero mean and standard deviation of 1 microvolt (µV):

    X_j = v + N_j

Assume that the noise voltages are independent RVs. How many measurements are required so that the probability that M_n is within ε = 1 µV of the true mean is at least 0.99?

Solution:

    E[X_j] = E[v + N_j] = v + E[N_j] = v
    VAR[X_j] = VAR[v + N_j] = VAR[N_j] = 1

For M_n = (1/n) \sum_{j=1}^{n} X_j,

    P[|M_n - v| < ε] ≥ 1 - σ_X^2 / (nε^2) = 1 - 1/n

Setting 1 - 1/n = 0.99 gives n = 100.

Laws of Large Numbers

Weak Law of Large Numbers: Let X_1, X_2, ... be a sequence of iid RVs with finite mean E[X] = µ. Then for any ε > 0, and even if the variance of the X_j does not exist,

    lim_{n→∞} P[|M_n - µ| < ε] = 1    (compare: lim_{n→∞} 1 - σ^2/(nε^2) = 1 when the variance is finite)

Strong Law of Large Numbers: Let X_1, X_2, ... be a sequence of iid RVs with finite mean E[X] = µ and finite variance. Then (proof: beyond sophomore level)

    P[ lim_{n→∞} M_n = µ ] = 1
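A one-sample-path sketch of the strong law (our own illustration; exponential RVs with µ = 1 are an arbitrary choice):

```python
# Running sample means along a single sequence of iid draws settle at mu = 1.
import numpy as np

rng = np.random.default_rng(4)
x = rng.exponential(1.0, 100_000)
m = np.cumsum(x) / np.arange(1, x.size + 1)   # M_1, M_2, ..., M_100000
for n in (10, 100, 1000, 10_000, 100_000):
    print(n, m[n - 1])                        # approaches mu = 1
```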

Example 7.10 In order to estimate the probability of an event A, a sequence of Bernoulli trials is carried out and the relative frequency of A is observed. How large should n be in order to have a 0.95 probability that the relative frequency is within 0.01 of p = P[A]?

Solution: Let X = I_A be the indicator function of the occurrence of A. Since p = P[A] is the occurrence probability of the Bernoulli trial, E[X] = E[I_A] = p and VAR[X] = p(1-p). The sample mean for X is the relative frequency:

    M_n = (1/n) \sum_{k=1}^{n} X_k = (1/n) \sum_{k=1}^{n} I_{A,k} = f_A(n)

Since M_n is an estimator for E[X] = p, f_A(n) is also an estimator for p. Because σ_X^2 = VAR[X] = p(1-p) = -(p - 1/2)^2 + 1/4 ≤ 1/4, applying the Chebyshev inequality gives

    P[|f_A(n) - p| ≥ ε] ≤ σ_X^2 / (nε^2) ≤ 1 / (4nε^2)
    P[|f_A(n) - p| < ε] ≥ 1 - 1 / (4nε^2)

Setting 1 - 1/(4nε^2) = 0.95 with ε = 0.01 gives n = 1/{(1 - 0.95) · 4ε^2} = 50,000.
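The same bound in two lines of code (a sketch; it just evaluates n = 1/(δ · 4ε^2) using the worst-case variance p(1-p) ≤ 1/4):

```python
# Worst-case Chebyshev sample size for estimating a probability.
import math

eps, delta = 0.01, 0.05
print(math.ceil(0.25 / (delta * eps**2)))   # 50000
```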

7.3 The Central Limit Theorem

Notion: Let X_1, X_2, ... be a sequence of iid RVs with finite mean µ and finite variance σ^2. Regardless of the probability distribution of the X_j, the sum of the first n RVs, S_n = X_1 + \cdots + X_n, approaches a Gaussian RV with mean E[S_n] = nE[X_j] = nµ and variance VAR[S_n] = n VAR[X_j] = nσ^2 as n becomes large.

Mathematical formulation: Standardise S_n by

    Z_n = (S_n - nµ) / (σ√n)

then

    lim_{n→∞} P[Z_n ≤ z] = (1/√(2π)) \int_{-∞}^{z} e^{-x^2/2} dx

In terms of M_n = (1/n) \sum_{j=1}^{n} X_j,

    Z_n = √n (M_n - µ) / σ

Importance of the Gaussian RV: The central limit theorem explains why the Gaussian RV appears in so many diverse applications: real observations often arise as the combined effect of many random variables.
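A sketch of the CLT at work (our own illustration; uniform(0,1) summands with µ = 1/2 and σ^2 = 1/12 are an arbitrary choice, and scipy is assumed available):

```python
# Standardized sums of iid uniform(0,1) RVs are approximately standard normal
# even for moderate n; compare the empirical CDF of Z_n with Phi(t).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(5)
n, reps = 30, 200_000
s = rng.uniform(0.0, 1.0, size=(reps, n)).sum(axis=1)
z = (s - n * 0.5) / (np.sqrt(n) * np.sqrt(1.0 / 12.0))   # Z_n
for t in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(t, np.mean(z <= t), norm.cdf(t))               # empirical vs Phi(t)
```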

Example 7.11 Suppose that orders at a restaurant are iid random variables with mean µ = $8 and standard deviation σ = $2.

1. Estimate the probability that the first 100 customers spend a total of more than $840.

    S_100 = \sum_{i=1}^{100} X_i,    Z_100 = (S_100 - nµ)/(σ√n) = (S_100 - 100·8)/(2·√100) = (S_100 - 800)/20

    P[S_100 > 840] = P[Z_100 > (840 - 800)/20] = P[Z_100 > 2] ≈ Q(2) = 2.28 × 10^{-2}

2. Estimate the probability that the first 100 customers spend a total of between $780 and $820.

    P[780 ≤ S_100 ≤ 820] = P[-1 ≤ Z_100 ≤ 1] ≈ Φ(1) - Φ(-1) = 1 - Q(1) - Q(1) = 1 - 2Q(1) ≈ 0.682
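The quoted values can be checked with scipy, using Q(z) = P[Z > z] = norm.sf(z) (a sketch, assuming scipy is available):

```python
from scipy.stats import norm

print(norm.sf(2))           # Q(2)  ~ 2.28e-2
print(1 - 2 * norm.sf(1))   # 1 - 2Q(1) ~ 0.6827
```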

Example 7.12 In Example 7.11, after how many orders can we be 90% sure that the total spent by all customers is more than $1000?

We require

    P[S_n > 1000] ≥ 0.9
    P[Z_n > (1000 - 8n)/(2√n)] = Q((1000 - 8n)/(2√n)) ≥ 0.9

Since Q(x) ≥ 0.9 requires x ≤ -Q^{-1}(0.1), with Q^{-1}(0.1) ≈ 1.2815,

    (1000 - 8n)/(2√n) ≤ -1.2815

Solving for n gives n ≥ 129.
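A numerical sketch of the same calculation (our own brute-force scan; scipy's norm.isf gives Q^{-1}):

```python
# Smallest n with Q((1000 - 8n)/(2 sqrt(n))) >= 0.9, i.e. with
# (1000 - 8n)/(2 sqrt(n)) <= -Q^{-1}(0.1).
import math
from scipy.stats import norm

q_inv = norm.isf(0.1)        # Q^{-1}(0.1) ~ 1.2815
n = 1
while (1000 - 8 * n) / (2 * math.sqrt(n)) > -q_inv:
    n += 1
print(n)                     # 129
```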