Probability. Hosung Sohn

1 Probability Hosung Sohn Department of Public Administration and International Affairs Maxwell School of Citizenship and Public Affairs Syracuse University Lecture Slide 4-3 (October 8, 2015) 1/ 43

2 Table of Contents 1 Means and Variances of Random Variables 2/ 43

3 Announcement Revised Lecture Note 4. Lecture Note 5 will be posted by the weekend. Please submit midterm evaluation forms you received via . Problem Set 2 will be posted on Friday (October 9, 2015): = Due on October 20, 2015! A mistake in the syllabus: = The deadline for Problem Set 4 is December 8, 2015, not December 1, 2015. 3/ 43

4 Review of Previous Lecture Random variable: = A variable X is a random variable if the value that X takes at the conclusion of an experiment is a chance or random occurrence that cannot be predicted with certainty in advance. Two types of random variables: 1. Discrete random variables. 2. Continuous random variables. 4/ 43

5 Review of Previous Lecture Discrete random variables: = A random variable X is discrete if X can take only a finite number of different values. Discrete probability distributions: = A discrete probability distribution is a table, graph, or rule that associates a probability $P(X = x_i)$ with each possible value $x_i$ that the discrete random variable X can take. 5/ 43

6 Review of Previous Lecture Discrete probability distribution in tables:

Value of X     x_1          x_2          x_3          x_4
Probability    P(X = x_1)   P(X = x_2)   P(X = x_3)   P(X = x_4)

Discrete probability distributions in figures: (figure not reproduced) 6/ 43

7 Review of Previous Lecture Continuous random variables: = A random variable X is continuous if X can take all the values in some interval. Continuous probability distributions: = A continuous probability distribution of a continuous random variable X is described by a density curve. The probability of any event is the area under the density curve and above the values of X that make up the event. 7/ 43

8 Review of Previous Lecture If X is a continuous random variable, $P(X \ge x_i) = P(X > x_i)$, because $P(X = x_i) = 0$. One example of continuous probability distributions: = Uniform distributions. Another example of continuous probability distributions: = Normal distributions. 8/ 43

9 Review Exercise 1 Suppose the population proportion of Internet users who say they use Twitter to post updates about themselves is 19%. Think about selecting random samples from a population in which 19% are Twitter users. 1. What is the sample space for selecting a single person? = S = {Y, N}. 2. If you select three people, what is the sample space? = S = {YYY, NYY, YNY, YYN, NNY, YNN, NYN, NNN}. 3. Define the sample space for the random variable that expresses the number of Twitter users in the sample of size 3. = X = {0, 1, 2, 3}. 4. What information is contained in the sample space for Question 2 that is not contained in the sample space for Question 3? = The sample space in Question 2 reveals which of the three people use Twitter. 9/ 43

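10 Review Exercise 2 The Twitter example continued: 1. Assign probabilities for S = {Y, N}? = P(Y) = 0.19 and P(N) = 1 - 0.19 = 0.81. 2. For S = {YYY, NYY, YNY, YYN, NNY, YNN, NYN, NNN}?
= $P(YYY) = P(Y)P(Y)P(Y) = 0.19^3 = 0.006859$ (why?).
= $P(NYY) = P(N)P(Y)P(Y) = 0.81 \times 0.19^2 = 0.029241$.
= $P(NNY) = P(N)P(N)P(Y) = 0.81^2 \times 0.19 = 0.124659$.
= $P(NNN) = P(N)P(N)P(N) = 0.81^3 = 0.531441$.
3. Probability distributions for the random variable X.

Outcome       YYY       NYY       YNY       YYN       NNY       YNN       NYN       NNN
Value of X    3         2         2         2         1         1         1         0
Probability   0.006859  0.029241  0.029241  0.029241  0.124659  0.124659  0.124659  0.531441

= $P(X = 2) = 3 \times 0.029241 = 0.087723$ and $P(X = 1) = 3 \times 0.124659 = 0.373977$. 10/ 43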
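The Twitter exercise can be checked with a short Python sketch. It assumes, as the multiplication of probabilities in the exercise does, that the three people are selected independently; the names `P_YES`, `outcome_probability`, and `distribution` are illustrative and not part of the lecture.

```python
from itertools import product

P_YES = 0.19       # P(Y), given in the exercise
P_NO = 1 - P_YES   # P(N) = 0.81

def outcome_probability(outcome):
    """Probability of one outcome such as 'NYY', assuming independent selections."""
    prob = 1.0
    for person in outcome:
        prob *= P_YES if person == "Y" else P_NO
    return prob

# Sample space for three people: all strings of length 3 over {Y, N}.
sample_space = ["".join(o) for o in product("YN", repeat=3)]

# Distribution of X = number of Twitter users among the three people.
distribution = {}
for outcome in sample_space:
    x = outcome.count("Y")
    distribution[x] = distribution.get(x, 0.0) + outcome_probability(outcome)

for x in sorted(distribution):
    print(x, round(distribution[x], 6))
```

Collapsing the eight outcomes into four values of X is exactly the loss of information noted in Review Exercise 1: the distribution of X no longer records which person uses Twitter.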

11 Introduction When describing data, we used tables and graphs (e.g., histograms, scatterplots etc.). Similarly, when describing random variables, we used tables and graphs. = We used tables and graphs to describe probability distributions of discrete or continuous random variables. On the other hand, we also used numerical measures to describe data (e.g., means, variance, etc.). We can also use numerical measures to describe random variables. = We can estimate the mean or the standard deviation of random variables. 11/ 43

12 The Expected Value (or Mean) of a Random Variable When talking about the mean of a random variable, we use the term expected value of a random variable. The expected value of a random variable is used as a measure of the center of the probability distribution of the random variable X. It is denoted as $E(X)$ or $\mu_X$. Recall that a statistic such as $\bar{x}$ can be considered as a random variable. = So we can define the expected value of $\bar{x}$; i.e., $E(\bar{x})$. Difference between $E(\bar{x})$ and $\bar{x}$? 12/ 43

13 The Expected Value (or Mean) of a Random Variable $E(\bar{x})$ vs. $\bar{x}$ 13/ 43

14 The Expected Value (or Mean) of a Random Variable The expected value of a discrete random variable X is

$$E(X) = \mu_X = \sum_{i=1}^{k} x_i p_i = x_1 p_1 + x_2 p_2 + \cdots + x_k p_k,$$

where $x_i$ is a value of X and $p_i$ is the corresponding probability: = i.e., $p_i = P(X = x_i)$. The mean is called the expected value because it denotes the average value that we would expect to occur if the experiment were repeated a large number of times. Another way to think about the expected value: = A weighted average in which each outcome (i.e., $x_i$) is weighted by its probability. 14/ 43
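As a minimal sketch of the weighted-average formula, the Python function below computes $E(X)$ for a discrete distribution. The values and probabilities are hypothetical and chosen only for illustration; they are not taken from the lecture.

```python
def expected_value(values, probs):
    """E(X) = sum of x_i * p_i for a discrete random variable."""
    if len(values) != len(probs):
        raise ValueError("values and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(x * p for x, p in zip(values, probs))

# Hypothetical distribution (illustration only).
values = [0, 1, 2, 3]
probs = [0.1, 0.2, 0.3, 0.4]

mean = expected_value(values, probs)
print(round(mean, 6))
```

The check that the probabilities sum to 1 reflects the requirement that the table describes a complete probability distribution.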

15 The Expected Value (or Mean) of a Random Variable Example: Suppose a random variable X denotes the years of education before entering the MPA program for our class. And assume that we have the following probability distribution for X: Value of X (i.e., Years of Education) Probability What is E(X)? Using the formula for the expected value, $E(X) = \sum_{i=1}^{3} x_i p_i = x_1 p_1 + x_2 p_2 + x_3 p_3$. 15/ 43

16 The Expected Value (or Mean) of a Random Variable The above example illustrates the calculation of the expected value of a discrete random variable. How do we calculate E(X) if X is a continuous random variable? The formula for the expected value of a continuous random variable is

$$E(X) = \int_a^b x f(x)\,dx,$$

where $f(x)$ is the probability density function of X on the interval from a to b. Intuitively, the expected value of a continuous random variable is the point at which the area under the density curve would balance. 16/ 43
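The integral can be approximated numerically. The sketch below uses a midpoint Riemann sum and, as an assumed example, a uniform density on the interval from 2 to 6, whose balance point is the midpoint 4. The function name and the interval are illustrative choices, not from the lecture.

```python
def expected_value_continuous(density, a, b, steps=100_000):
    """Approximate E(X) = integral of x * f(x) over [a, b] with a midpoint sum."""
    width = (b - a) / steps
    total = 0.0
    for i in range(steps):
        mid = a + (i + 0.5) * width   # midpoint of the i-th subinterval
        total += mid * density(mid) * width
    return total

# Assumed example: uniform distribution on [2, 6], so f(x) = 1 / (6 - 2).
a, b = 2.0, 6.0

def uniform_density(x):
    return 1.0 / (b - a)

mean = expected_value_continuous(uniform_density, a, b)
print(round(mean, 6))
```

For a symmetric density such as the uniform, the result agrees with the balance-point intuition: the mean sits at the center of the interval.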

17 The Expected Value (or Mean) of a Random Variable 17/ 43

18 Statistical Estimation and the Law of Large Numbers Our goal in using statistical science: = Estimate the population parameter using a statistic! Suppose we want to estimate the mean height µ of the population of all American women between the ages of 18 and 24 years. To estimate µ: 1. We draw an SRS of young women. 2. Use the sample mean $\bar{x}$ to estimate µ. To reiterate, µ is a parameter and $\bar{x}$ is a statistic. 18/ 43

19 Statistical Estimation and the Law of Large Numbers Statistics such as $\bar{x}$ obtained from probability sampling designs are random variables. Why? = We don't know their values until we draw an SRS, and their values vary in repeated sampling. Thus, we can think of the sampling distributions of these statistics as the probability distributions of these random variables. 19/ 43

20 Statistical Estimation and the Law of Large Numbers We also learned that it is reasonable to use $\bar{x}$ to estimate µ. = An SRS should fairly represent the population, so $\bar{x}$ should be somewhat near µ. But we don't expect $\bar{x}$ to be exactly equal to µ. And we know that if we draw another SRS, then it would give us a different $\bar{x}$. 20/ 43

21 Statistical Estimation and the Law of Large Numbers If $\bar{x}$ is rarely exactly right and varies from sample to sample, why do we use it to estimate µ? One answer we learned is that $\bar{x}$ is an unbiased estimator of µ. Another reason: if we keep increasing the sample size when we draw an SRS, the statistic $\bar{x}$ is guaranteed to get as close as we wish to the parameter µ. = This fact is called the law of large numbers (LLN). The LLN is a very useful law because it holds for any population with a finite mean, regardless of the shape or spread of the distribution of the population data. 21/ 43

22 Statistical Estimation and the Law of Large Numbers Law of Large Numbers (LLN) Definition The law of large numbers (LLN) states that as the number of observations drawn increases in a single SRS, the mean $\bar{x}$ eventually approaches the mean µ of the population. 22/ 43

23 Statistical Estimation and the Law of Large Numbers Suppose that the mean height of all women is 64.5 inches; i.e., µ = 64.5. The figure below shows the behavior of the mean height $\bar{x}$ of n women chosen at random from a population. (figure not reproduced) 23/ 43

24 Statistical Estimation and the Law of Large Numbers At first, the graph shows that the mean of the sample changes as we take more observations. Eventually, however, the mean gets close to the population mean µ = 64.5 and settles down at that value. = LLN says that this always happens. 24/ 43
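The behavior described on these slides can be imitated with a small simulation. This is only a sketch: it assumes heights are normally distributed and uses a hypothetical standard deviation of 2.5 inches, neither of which is stated in the lecture; only µ = 64.5 comes from the slides.

```python
import random

MU = 64.5     # population mean height in inches, from the slides
SIGMA = 2.5   # hypothetical population standard deviation (assumption)

def running_means(n, seed=0):
    """Return the sample mean after each of the first n simulated observations."""
    rng = random.Random(seed)   # fixed seed so the run is reproducible
    total = 0.0
    means = []
    for i in range(1, n + 1):
        total += rng.gauss(MU, SIGMA)
        means.append(total / i)
    return means

means = running_means(100_000)
for n in (1, 10, 100, 1_000, 10_000, 100_000):
    print(n, round(means[n - 1], 3))
```

Early running means can sit noticeably away from 64.5, while the later ones cluster tightly around it, which is the pattern the LLN describes.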

25 Statistical Estimation and the Law of Large Numbers LLN is intuitively clear. Suppose our population size is 10,000. 1. If you take an SRS of size 100, then $\bar{x}$ based on these 100 observations is not exactly equal to µ, which is based on all 10,000. 2. What if you take an SRS of 9,500? Then $\bar{x}$ based on these 9,500 observations would be almost equal to µ. 3. What if you take an SRS of 9,999? Then it is almost certain that $\bar{x}$ based on these 9,999 observations is equal to µ. This is what LLN is telling us about. 25/ 43

26 Statistical Estimation and the Law of Large Numbers So LLN tells us that if we draw a large number of observations in a single SRS, then $\bar{x}$ is almost equal to µ. But we can ask a question: how large is a large number? = The answer depends on the variability of the population. If our outcome of interest is highly variable in the population, then we need more observations. If our outcome of interest is not very variable in the population, then $\bar{x}$ gets close to µ even if we don't have that many observations. 26/ 43

27 Statistical Estimation and the Law of Large Numbers Suppose we would like to estimate the mean salary of all the people in the US. = Salaries are highly variable, so we need quite a large number of observations to estimate the population mean salary precisely. Suppose we would like to estimate the mean number of cars that the households in the US possess. = In general, the number of cars possessed by households does not vary to a great extent (maybe around one to three). = So we don't need a large number of observations to estimate the mean number of cars. 27/ 43
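The link between population variability and the number of observations needed can be illustrated by simulation. The sketch below is built on assumptions that are not in the lecture: two hypothetical normal populations with standard deviations 10 and 1, samples of size 50, and 2,000 repeated samples. It measures how much the sample mean varies from sample to sample in each case.

```python
import random
import statistics

def spread_of_sample_means(sigma, n, reps=2000, seed=1):
    """Standard deviation of the sample mean across repeated samples of size n."""
    rng = random.Random(seed)
    sample_means = [
        statistics.fmean([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.pstdev(sample_means)

# Hypothetical populations: one highly variable, one not very variable.
spread_high = spread_of_sample_means(sigma=10.0, n=50)
spread_low = spread_of_sample_means(sigma=1.0, n=50)

print("highly variable population:", round(spread_high, 3))
print("less variable population:  ", round(spread_low, 3))
```

With the same sample size, the sample mean from the more variable population wanders much more, so more observations are needed to pin down its mean to the same precision.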

28 Rules for Means Sometimes, there are instances in which we want to find out the expected value of two or more random variables. There are some rules that come in handy when we calculate the expected value of such random variables. We will study four rules. 28/ 43

29 Rules for Means Rule 1: If X is a random variable and a and b are fixed numbers (i.e., constants), then $E(a + bX) = a + bE(X)$. = We say that $a + bX$ is a linear transformation of the random variable X. Rule 2: If X and Y are random variables, then $E(X + Y) = E(X) + E(Y)$. 29/ 43

30 Rules for Means Rule 3: If X and Y are random variables, then $E(X - Y) = E(X) - E(Y)$. Rule 4: If we combine Rule 1 and Rule 2 or 3, then we have the following rule: $E(a + bX + cY) = a + bE(X) + cE(Y)$ or $E(a - bX + cY) = a - bE(X) + cE(Y)$. 30/ 43

31 Rules for Means But!!! The rule in general doesn't hold for multiplication and division. That is, in general, $E(XY) \neq E(X)E(Y)$ and $E(X/Y) \neq E(X)/E(Y)$. 31/ 43
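These rules can be checked exactly on a small example. The sketch below is an assumed illustration, not from the lecture: X is the roll of a fair six-sided die and Y is set equal to X, so the two variables are strongly dependent. Exact fractions avoid rounding issues.

```python
from fractions import Fraction

# Hypothetical joint distribution: X is a fair die roll and Y = X.
joint = {(x, x): Fraction(1, 6) for x in range(1, 7)}

def expect(g):
    """Expected value of g(X, Y) under the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
e_linear = expect(lambda x, y: 2 + 3 * x)   # Rule 1 with a = 2, b = 3
e_sum = expect(lambda x, y: x + y)          # Rule 2
e_diff = expect(lambda x, y: x - y)         # Rule 3
e_prod = expect(lambda x, y: x * y)         # multiplication

print("E(2 + 3X) =", e_linear, " 2 + 3E(X) =", 2 + 3 * e_x)
print("E(X + Y) =", e_sum, " E(X) + E(Y) =", e_x + e_y)
print("E(X - Y) =", e_diff, " E(X) - E(Y) =", e_x - e_y)
print("E(XY) =", e_prod, " E(X)E(Y) =", e_x * e_y)
```

The addition and subtraction rules hold even though X and Y are dependent, while the product rule fails here, which is the point of the warning on the slide.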

32 Rules for Means Example: Let X and Y be random variables that denote the number of courses taken in the fall and spring semester, respectively, by students at the Maxwell School. And we have the following probability distributions. Value of X Probability Value of Y Probability If we pick a student randomly, what is the expected value of the number of courses across both semesters? 32/ 43

33 Rules for Means Value of X Probability Value of Y Probability Solution: The question asks us to find $E(X + Y)$. Using Rule 2 above, we know that $E(X + Y) = E(X) + E(Y)$, so we compute $E(X)$ and $E(Y)$ separately from their tables, each as a probability-weighted sum, and add them. 33/ 43

34 The Variance of a Random Variable The expected value is a numerical measure of the center of a probability distribution. We need another measure; i.e., the spread or variability of the probability distribution. = We use the variance and standard deviation of a random variable. We write the variance of a random variable X as $\mathrm{Var}(X)$ or $\sigma_X^2$. 34/ 43

35 The Variance of a Random Variable The variance of a discrete random variable X is the expected value of the squared deviations from the mean and is given by the formula

$$\mathrm{Var}(X) = \sigma_X^2 = E\left[(X - E(X))^2\right] = [x_1 - E(X)]^2 p_1 + [x_2 - E(X)]^2 p_2 + \cdots + [x_k - E(X)]^2 p_k = \sum_{i=1}^{k} (x_i - E(X))^2 p_i.$$

Note that the variance can also be calculated by the following formula: $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$. 35/ 43

36 The Variance of a Random Variable Let's prove the alternative formula: $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$. Proof

$$\mathrm{Var}(X) = E\left[(X - E(X))^2\right] = E\left[X^2 - 2XE(X) + [E(X)]^2\right] = E(X^2) - 2E(X)E(X) + [E(X)]^2 = E(X^2) - [E(X)]^2.$$

The third equality uses the rules of the mean, treating $E(X)$ as a constant. In some cases, especially when the mean is not an integer, it may be easier to calculate the variance by using the alternative formula rather than the original formula. 36/ 43

37 The Standard Deviation of a Random Variable The standard deviation of a discrete random variable X is given by the formula $SD(X) = \sigma_X = \sqrt{\mathrm{Var}(X)}$. Question: can you tell the difference between $\mathrm{Var}(X)$ and $s^2$? Answer: 1. $\mathrm{Var}(X)$ indicates the variability that arises from repeated sampling. 2. $s^2$ indicates the variability in the values among observations in a single sample. 37/ 43
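The equivalence of the two variance formulas can be confirmed on a concrete case. The distribution below is hypothetical (values 1, 2, 4 with probabilities 1/2, 1/4, 1/4) and is used only to illustrate the algebra; exact fractions make the comparison exact.

```python
from fractions import Fraction

# Hypothetical discrete distribution (illustration only).
values = [1, 2, 4]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

def expect(g):
    """E[g(X)] = sum of g(x_i) * p_i."""
    return sum(g(x) * p for x, p in zip(values, probs))

mean = expect(lambda x: x)

# Definition: expected squared deviation from the mean.
var_definition = expect(lambda x: (x - mean) ** 2)

# Alternative formula: E(X^2) - [E(X)]^2.
var_alternative = expect(lambda x: x ** 2) - mean ** 2

print("E(X) =", mean)
print("Var(X) by definition  =", var_definition)
print("Var(X) by alternative =", var_alternative)
```

Both routes give the same number, as the proof requires; the alternative route avoids subtracting the mean from every value first.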

38 The Standard Deviation of a Random Variable Example: Find the variance and the standard deviation of the random variable X that has the following probability distribution: Value of X 0 3 Probability Solution: First, compute $E(X) = \sum_i x_i p_i$. To calculate the variance, I will use the alternative formula $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$: compute $E(X^2) = \sum_i x_i^2 p_i$, then $\mathrm{Var}(X) = E(X^2) - [E(X)]^2$, and finally $SD(X) = \sqrt{\mathrm{Var}(X)}$. 38/ 43

39 The Rules of the Variance Some rules for the variance. Rule 1: If X is a random variable, and a and b are fixed numbers, then $\mathrm{Var}(a + bX) = b^2\,\mathrm{Var}(X)$. = Notice that the constant a disappears and b comes out as a squared term. Rule 2: If X and Y are random variables, then $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) + 2\,\mathrm{Cov}(X, Y)$ and $\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) - 2\,\mathrm{Cov}(X, Y)$. = Notice the sign right before the covariance term. 39/ 43

40 The Rules of the Variance Rule 3: If X and Y are independent random variables, then $\mathrm{Var}(X + Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$ and $\mathrm{Var}(X - Y) = \mathrm{Var}(X) + \mathrm{Var}(Y)$. = Notice that the covariance term disappears, and that the two variances are added in both cases, even for the difference $X - Y$. = Rule 3 above reflects the fact that if X and Y are independent, then $\mathrm{Cov}(X, Y) = 0$ and $\mathrm{Corr}(X, Y) = 0$. 40/ 43
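The variance rules can be verified exactly on a small joint distribution. The sketch below rests on assumptions that are not in the lecture: a hypothetical joint distribution of two 0/1 variables that tend to move together, and the standard identity $\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y)$, which the slides do not state.

```python
from fractions import Fraction

# Hypothetical joint distribution of (X, Y), both taking values 0 or 1.
joint = {
    (0, 0): Fraction(2, 5),
    (0, 1): Fraction(1, 10),
    (1, 0): Fraction(1, 10),
    (1, 1): Fraction(2, 5),
}

def expect(g):
    """Expected value of g(X, Y) under the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

def variance(g):
    """Variance of g(X, Y), computed as E[g^2] - (E[g])^2."""
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2

var_x = variance(lambda x, y: x)
var_y = variance(lambda x, y: y)
# Standard covariance identity (assumption: not stated on the slides).
cov_xy = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)

var_linear = variance(lambda x, y: 2 + 3 * x)   # Rule 1 with a = 2, b = 3
var_sum = variance(lambda x, y: x + y)          # Rule 2, sum
var_diff = variance(lambda x, y: x - y)         # Rule 2, difference

print("Var(2 + 3X) =", var_linear, " 9 Var(X) =", 9 * var_x)
print("Var(X + Y) =", var_sum, " rule gives", var_x + var_y + 2 * cov_xy)
print("Var(X - Y) =", var_diff, " rule gives", var_x + var_y - 2 * cov_xy)
```

Because the covariance here is positive, the variance of the sum is larger than the sum of the variances and the variance of the difference is smaller, matching the sign pattern in Rule 2.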

41 Independence Between Two Random Variables The notion of independence between two random variables is very important in statistics especially when you learn Econometrics. Suppose X is a random variable that indicates whether the students in the Maxwell School are taking PAI 721 Introduction to Statistics. And suppose that Y is a random variable that denotes whether the students are in the MPA program or not. Are the two random variables X and Y independent? 41/ 43

42 Independence Between Two Random Variables Students in the MPA program are required to take PAI 721. So if you know the value of Y for Student A, then you are more likely to know the value of X for this student. That is, the information you have about the random variable Y helps you determine the value of the random variable X. Hence, in this case, we say that the two random variables X and Y are not independent, or we say that the two random variables are dependent. 42/ 43

43 Independence Between Two Random Variables On the other hand, suppose X denotes the toe size of students in the Maxwell School. Knowing whether a student is in the MPA program will not help us determine the toe size of that student. So in this case, we can reasonably assume that X and Y are independent. 43/ 43
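The idea that knowing Y does or does not help with X can be made concrete using the usual formal criterion, which is not stated on the slides: X and Y are independent when $P(X = x, Y = y) = P(X = x)P(Y = y)$ for every pair of values. The sketch below applies that check to two hypothetical joint distributions; all probabilities are invented for illustration.

```python
from fractions import Fraction
from itertools import product

def marginals(joint):
    """Marginal distributions of X and Y from a joint distribution."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    return px, py

def is_independent(joint):
    """True if P(X=x, Y=y) = P(X=x) * P(Y=y) for every pair (x, y)."""
    px, py = marginals(joint)
    return all(joint.get((x, y), 0) == px[x] * py[y] for x, y in product(px, py))

# Hypothetical: X = 1 if taking PAI 721, Y = 1 if in the MPA program.
# Every MPA student takes the course, so P(X = 0, Y = 1) = 0.
course_and_program = {
    (1, 1): Fraction(1, 2),
    (0, 1): Fraction(0),
    (1, 0): Fraction(1, 10),
    (0, 0): Fraction(2, 5),
}

# Hypothetical: X = 1 if toe size is above the median, Y = 1 if in the MPA program.
toe_and_program = {
    (0, 0): Fraction(1, 4),
    (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 4),
    (1, 1): Fraction(1, 4),
}

print("course vs. program independent?", is_independent(course_and_program))
print("toe size vs. program independent?", is_independent(toe_and_program))
```

In the first table, learning that a student is in the MPA program changes the probability of taking the course, so the product rule fails; in the second, program membership leaves the toe-size probabilities unchanged.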