Topic 3: Random variables, expectation, and variance, II

CSE 103: Probability and statistics, Fall 2010

3.1 Linearity of expectation

If you double each value of X, then you also double its average; that is, E(2X) = 2E(X). Likewise, if you raise each of its values by 1, you also increase the average by 1; that is, E(X + 1) = E(X) + 1. More generally, for any constants a, b,

    E(aX + b) = aE(X) + b.

Another exceptionally useful formula says that the mean value of a sum of random variables is simply the sum of their individual means. Formally, for any random variables X, Y,

    E(X + Y) = E(X) + E(Y).

For example, recall our earlier example about two rolls of a die, in which we let X be the sum of the rolls and derived E(X) by first computing Pr(X = x) for all x ∈ {2, 3, ..., 12}. Well, now we can do it much more easily: simply write X_1 for the first roll and X_2 for the second roll, so that X = X_1 + X_2. We already know E(X_i) = 3.5, so E(X) = 7.

More generally, for any random variables X_1, X_2, ..., X_n,

    E(X_1 + ... + X_n) = E(X_1) + ... + E(X_n).

Some quick examples:

1. Roll n dice and let X be the number of sixes. What is E(X)?

   This time, let X_i be 1 if the ith roll is a six, and 0 otherwise. Then E(X_i) = 1/6, so E(X) = n/6.

2. Toss n coins of bias p and let X be the number of heads. What is E(X)?

   Let X_i be 1 if the ith coin turns up heads, and 0 if it turns up tails. Then E(X_i) = p, and since X = X_1 + ... + X_n, we have E(X) = np.

3. Toss n coins of bias p; what is the expected number of times HTH appears in the resulting sequence?

   Let X_i be 1 if there is an occurrence of HTH starting at position i (so 1 ≤ i ≤ n - 2). The total number of such occurrences is X = X_1 + X_2 + ... + X_{n-2}. Since E(X_i) = p²(1 - p), we have E(X) = (n - 2)p²(1 - p).
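These indicator-variable calculations are easy to sanity-check numerically. Below is a minimal simulation sketch for examples 1 and 3 (in Python; the function names and parameters are illustrative and not part of the original notes):

    import random

    def expected_sixes(n, trials=100_000):
        # Estimate E(number of sixes) when rolling n fair dice.
        total = 0
        for _ in range(trials):
            total += sum(1 for _ in range(n) if random.randint(1, 6) == 6)
        return total / trials

    def expected_hth(n, p, trials=100_000):
        # Estimate E(number of HTH occurrences) in n tosses of a coin with bias p.
        total = 0
        for _ in range(trials):
            seq = ['H' if random.random() < p else 'T' for _ in range(n)]
            total += sum(1 for i in range(n - 2) if seq[i:i + 3] == ['H', 'T', 'H'])
        return total / trials

    print(expected_sixes(12))      # close to 12/6 = 2
    print(expected_hth(10, 0.5))   # close to (10 - 2) * 0.25 * 0.5 = 1.0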

3.1.1 Fixed points of a permutation

The fixed points of a permutation are the numbers that remain in their original position. For instance, in the permutation

    (1, 2, 3, 4, 5, 6) → (6, 2, 5, 4, 1, 3)

the fixed points are 2 and 4. Let X be the number of fixed points in a random permutation of (1, 2, ..., n); what is E(X)?

Linearity is very helpful here. Define the random variable X_i to be 1 if i is a fixed point, and 0 otherwise. Then E(X_i) = 1/n. Therefore

    E(X) = E(X_1 + ... + X_n) = 1.

The expected number of fixed points is 1, regardless of n.

3.1.2 Coupon collector, again

Recall the setting: each cereal box holds one of k action figures (chosen at random), and you want to collect all k figures. What is the expected number of cereal boxes you need to buy?

Suppose you keep buying boxes until you get all k figures. Let X_i be the number of boxes you buy to get from i - 1 distinct figures to i distinct figures. Then X = X_1 + X_2 + ... + X_k, and of course X_1 = 1.

What is E(X_i)? Well, you already have i - 1 of the figures, so the chance of getting a new figure in a cereal box is (k - (i - 1))/k. Call this p. Therefore, the expected number of boxes you have to wait to get a new figure is 1/p: just like waiting for a coin with bias p to turn up heads. That is,

    E(X_i) = k/(k - i + 1).

Invoking linearity of expectation,

    E(X) = E(X_1) + ... + E(X_k) = k/k + k/(k - 1) + ... + k/1 = k(1 + 1/2 + ... + 1/k) ≈ k ln k.

This confirms our earlier observations about the coupon collector problem: you need to buy about k ln k boxes.

3.1.3 Balls in bins, again

Toss m balls into n bins; what is the expected number of collisions?

Let's make this more precise. For any 1 ≤ i < j ≤ m, define the random variable X_ij to be 1 if balls i and j land in the same bin, and 0 otherwise. Then the number of collisions is defined to be

    X = Σ_{1 ≤ i < j ≤ m} X_ij.

Since E(X_ij) = 1/n (do you see why?), it follows that the expected number of collisions is

    E(X) = (m choose 2) · (1/n) = m(m - 1)/(2n).

So if m < √(2n), the expected number of collisions is less than 1, which means that typically every ball goes into a different bin. This relates back to the birthday paradox, where m is close to the threshold √(2n).
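Both of the estimates above, the k ln k bound from 3.1.2 and m(m - 1)/(2n) from 3.1.3, are easy to check by simulation. A rough sketch (in Python; names and parameters are illustrative, not part of the original notes):

    import random
    from math import log, sqrt

    def boxes_needed(k):
        # Buy boxes until all k distinct figures have been seen; return the count.
        seen, boxes = set(), 0
        while len(seen) < k:
            seen.add(random.randrange(k))
            boxes += 1
        return boxes

    def collisions(m, n):
        # Toss m balls into n bins and count the colliding pairs.
        bins = [random.randrange(n) for _ in range(m)]
        return sum(1 for i in range(m) for j in range(i + 1, m) if bins[i] == bins[j])

    trials = 10_000

    k = 50
    avg_boxes = sum(boxes_needed(k) for _ in range(trials)) / trials
    print(avg_boxes, k * sum(1 / i for i in range(1, k + 1)), k * log(k))
    # the simulated average matches k(1 + 1/2 + ... + 1/k), which is roughly k ln k

    m, n = 23, 365
    avg_coll = sum(collisions(m, n) for _ in range(trials)) / trials
    print(avg_coll, m * (m - 1) / (2 * n), sqrt(2 * n))
    # expected collisions ≈ m(m - 1)/(2n); here m is near the √(2n) threshold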

3.2 Independent random variables

Random variables X and Y are independent if

    Pr(X = x, Y = y) = Pr(X = x) Pr(Y = y) for all x, y.

In words, the joint distribution of (X, Y) factors into the product of the individual distributions. This also implies, for instance, that Pr(X = x | Y = y) = Pr(X = x).

Which of the following pairs (X, Y) are independent?

1. Pick a random card out of a standard deck. Define X to be 1 if it is a heart, and 0 otherwise. Define Y to be 1 if it is a jack, queen, or king, and 0 otherwise.

2. Toss a fair coin n times, and define X to be the number of heads, and Y to be 1 if the last toss is heads (and 0 otherwise).

3. X and Y take values in {-1, 0, 1}, and their joint distribution is given by the following table of probabilities.

                     X
               -1     0     1
       Y  -1  0.40  0.16  0.24
           0  0.05  0.02  0.03
           1  0.05  0.02  0.03

If X, Y are independent, they satisfy the following useful product rule:

    E(XY) = E(X) E(Y).

Another useful fact is that f(X) and g(Y) must also be independent, for any functions f and g.
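For the table in example 3, you can verify independence directly by checking that every entry equals the product of its row and column marginals. A small check (in Python; illustrative, not part of the original notes) that also confirms the product rule E(XY) = E(X)E(Y) for this table:

    vals = [-1, 0, 1]
    joint = {(-1, -1): 0.40, (0, -1): 0.16, (1, -1): 0.24,   # keys are (x, y)
             (-1,  0): 0.05, (0,  0): 0.02, (1,  0): 0.03,
             (-1,  1): 0.05, (0,  1): 0.02, (1,  1): 0.03}

    px = {x: sum(joint[(x, y)] for y in vals) for x in vals}  # marginal of X
    py = {y: sum(joint[(x, y)] for x in vals) for y in vals}  # marginal of Y

    independent = all(abs(joint[(x, y)] - px[x] * py[y]) < 1e-12
                      for x in vals for y in vals)
    E_XY = sum(x * y * joint[(x, y)] for x in vals for y in vals)
    E_X = sum(x * px[x] for x in vals)
    E_Y = sum(y * py[y] for y in vals)
    print(independent, E_XY, E_X * E_Y)   # True, and E(XY) = E(X)E(Y) = 0.14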

3.3 Variance

If you need to summarize a probability distribution by a single number, then the mean is a reasonable choice, although often the median is better advised (more on this later). But neither the mean nor the median captures how spread out the distribution is. Look at the following two distributions:

    [Figure: two distributions, both centered at 100; one is tightly concentrated around 100, the other is nearly flat.]

They both have the same expectation, 100, but one is concentrated near the middle while the other is pretty flat. To distinguish between them, we are interested not just in the mean µ = E(X), but also in the typical distance from the mean, E(|X - µ|). It turns out to be mathematically convenient to work with the square instead: the variance of X is defined to be

    var(X) = E((X - µ)²) = E((X - E(X))²).

In the above example, the distribution on the right has a higher variance than the one on the left.

3.3.1 Properties of the variance

In what follows, take µ to be E(X).

1. The variance cannot be negative. Since each individual value (X - µ)² is ≥ 0 (it is a square), the average value E((X - µ)²) must be ≥ 0 as well.

2. var(X) = E(X²) - µ². This is because

       var(X) = E((X - µ)²)
              = E(X² + µ² - 2µX)
              = E(X²) + E(µ²) + E(-2µX)    (linearity)
              = E(X²) + µ² - 2µE(X)
              = E(X²) + µ² - 2µ²
              = E(X²) - µ².

3. For any random variable X, it must be the case that E(X²) ≥ (E(X))². This is simply because var(X) = E(X²) - (E(X))² ≥ 0.

4. E(|X - µ|) ≤ √(var(X)). If you apply the previous property to the random variable |X - µ| instead of X, you get E(|X - µ|²) ≥ (E(|X - µ|))². Therefore,

       E(|X - µ|) ≤ √(E(|X - µ|²)) = √(var(X)).

The last property tells us that √(var(X)) is a good measure of the typical spread of X: how far it typically lies from its mean. We call √(var(X)) the standard deviation of X.
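These properties are easy to verify numerically for any small distribution. A sketch (in Python; illustrative, not part of the original notes) using a fair six-sided die:

    outcomes = [1, 2, 3, 4, 5, 6]
    probs = [1 / 6] * 6

    mu = sum(p * x for p, x in zip(probs, outcomes))                     # E(X)
    var1 = sum(p * (x - mu) ** 2 for p, x in zip(probs, outcomes))       # E((X - mu)^2)
    var2 = sum(p * x ** 2 for p, x in zip(probs, outcomes)) - mu ** 2    # E(X^2) - mu^2
    mad = sum(p * abs(x - mu) for p, x in zip(probs, outcomes))          # E(|X - mu|)

    print(var1, var2)        # equal, as property 2 promises: both 35/12 ≈ 2.917
    print(mad, var1 ** 0.5)  # 1.5 ≤ 1.708..., consistent with property 4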

3.3.2 Examples

1. Suppose you toss a coin with bias p, and let X be 1 if the outcome is heads, or 0 if the outcome is tails. Let's look at the distributions of X and of X².

       Prob     X    X²
       p        1    1
       1 - p    0    0

   From this table, E(X) = p and E(X²) = p. Thus the variance is var(X) = E(X²) - (E(X))² = p - p² = p(1 - p).

2. Roll a 4-sided die (a tetrahedron) in which each face is equally likely to come up, and let the outcome be X ∈ {1, 2, 3, 4}. We have two formulas for the variance:

       var(X) = E((X - µ)²)    and    var(X) = E(X²) - µ²,

   where µ = E(X). Let's try both and make sure we get the same answer. First of all, µ = E(X) = (1 + 2 + 3 + 4)/4 = 2.5. Now, let's tabulate the distributions of X² and (X - µ)².

       Prob    X    X²    (X - µ)²
       1/4     1     1     2.25
       1/4     2     4     0.25
       1/4     3     9     0.25
       1/4     4    16     2.25

   Reading from this table,

       E(X²) = (1 + 4 + 9 + 16)/4 = 7.5
       E((X - µ)²) = (2.25 + 0.25 + 0.25 + 2.25)/4 = 1.25.

   The first formula for the variance gives var(X) = E((X - µ)²) = 1.25. The second formula gives var(X) = E(X²) - µ² = 7.5 - (2.5)² = 1.25, the same thing.

3. Roll a k-sided die in which each face is equally likely to come up. The outcome is X ∈ {1, 2, ..., k}. The expected outcome is

       E(X) = (1 + 2 + ... + k)/k = (1/k) · k(k + 1)/2 = (k + 1)/2,

   using the special formula for the sum of the first k integers. There's another for the sum of the first k squares, from which

       E(X²) = (1² + 2² + ... + k²)/k = (1/k) · k(k + 1)(2k + 1)/6 = (k + 1)(2k + 1)/6.

   Then

       var(X) = E(X²) - (E(X))² = (k + 1)(2k + 1)/6 - (k + 1)²/4 = (k² - 1)/12.

   The standard deviation is thus approximately k/√12.

4. X is the number of fixed points of a random permutation of (1, 2, ..., n).

   Proceeding as before, let X_i be 1 if i is a fixed point of the permutation, and 0 otherwise. Then E(X_i) = 1/n. For i ≠ j, the product X_i X_j is 1 only if both i and j are fixed points, which occurs with probability 1/(n(n - 1)) (why?). Thus E(X_i X_j) = 1/(n(n - 1)). Since X is the sum of the individual X_i, we have E(X) = 1 and

       E(X²) = E((X_1 + ... + X_n)²)
             = E(Σ_i X_i² + Σ_{i ≠ j} X_i X_j)
             = Σ_i E(X_i²) + Σ_{i ≠ j} E(X_i X_j)
             = n · (1/n) + n(n - 1) · 1/(n(n - 1))
             = 2.

   Thus var(X) = E(X²) - (E(X))² = 1. This means that the number of fixed points has mean 1 and variance 1: in short, it is quite unlikely to be very much larger than 1.
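The conclusion of example 4, mean 1 and variance 1 regardless of n, is striking enough to be worth checking by simulation. A sketch (in Python; illustrative, not part of the original notes):

    import random

    def fixed_points(n):
        # Count the fixed points of a uniformly random permutation of (0, ..., n-1).
        perm = list(range(n))
        random.shuffle(perm)
        return sum(1 for i, v in enumerate(perm) if i == v)

    n, trials = 100, 50_000
    samples = [fixed_points(n) for _ in range(trials)]
    mean = sum(samples) / trials
    var = sum((s - mean) ** 2 for s in samples) / trials
    print(mean, var)   # both close to 1, for any choice of n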

3.3.3 Another property of the variance

Here's a cartoon picture of a well-behaved distribution with mean µ and standard deviation σ (that is, µ = E(X) and σ² = var(X)).

    [Figure: a bell-shaped curve centered at µ, with an interval of width σ marked on either side of the mean.]

The standard deviation quantifies the spread of the distribution, whereas the mean specifies its location. If you increase all values of X by 10, then the distribution shifts to the right and the mean increases by 10, but the spread of the distribution, and thus the standard deviation, remains unchanged. On the other hand, if you double all values of X, then its distribution becomes twice as wide, and thus its standard deviation σ is doubled. Which means that its variance, which is the square of the standard deviation, gets multiplied by 4. In summary, for any constants a, b:

    var(aX + b) = a² var(X).

Contrast this with the mean: E(aX + b) = aE(X) + b.

3.3.4 Linearity of variance

If X and Y are independent random variables, then var(X + Y) = var(X) + var(Y). More generally, if X_1, ..., X_n are independent, then

    var(X_1 + ... + X_n) = var(X_1) + ... + var(X_n).

In contrast, linearity of expectation (E(X + Y) = E(X) + E(Y)) holds even if the random variables are not independent.
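Both facts in these last two subsections can be checked with a quick simulation. A sketch (in Python; illustrative, not part of the original notes) using independent fair dice:

    import random

    def variance(samples):
        # Empirical variance: average squared deviation from the sample mean.
        m = sum(samples) / len(samples)
        return sum((s - m) ** 2 for s in samples) / len(samples)

    trials = 200_000
    X = [random.randint(1, 6) for _ in range(trials)]
    Y = [random.randint(1, 6) for _ in range(trials)]   # independent of X

    a, b = 3, 10
    print(variance([a * x + b for x in X]), a ** 2 * variance(X))   # approximately equal
    print(variance([x + y for x, y in zip(X, Y)]),
          variance(X) + variance(Y))                                # approximately equal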