STAT 414: Introduction to Probability Theory
Spring 2016; Homework Assignments
Last updated on April 29, 2016

HW1 (Due on Jan. 21)
Chapter 1
Problems 1, 8, 9, 10, 11, 18, 19, 26, 28, 30
Theoretical Exercises 6, 8

HW2 (Due on Jan. 28)
Chapter 2
Problems 2, 3, 8, 9, 11, 13, 14, 15, 24, 28, 35, 52
Theoretical Exercises 6, 11

HW3 (Due on Feb. 4)
Problem 1: Probability basics and conditional probability basics using simulation. You can find R code here: http://www.personal.psu.edu/lxx6/hw3.r
1. Probability basics: Simulate 10 rolls of a die using R. This can be done by taking a sample of size 10, with replacement, from the set {1, 2, 3, 4, 5, 6}. Do this twice and find the proportion of times the die roll lands on 6. Repeat this twice for 100 rolls and twice for 1,000,000 (1 million) rolls. Report the set of proportions you observe. You should have three sets of two numbers each. What do you notice about these proportions as you increase the number of rolls? This should agree with the statement that the probability of an event is its limiting relative frequency.
2. Conditional probability basics: Repeat the same experiment as above, say for 10,000 rolls. Count the proportion of times you obtain a 6, but look only at die rolls that are 4, 5, or 6. What do you obtain now? Repeat this experiment once
more and report your results. How do these two values compare to the true conditional probability of getting a 6 given that the die roll is known to be 4, 5, or 6?

Chapter 3
Problems 4, 12, 16, 18 (Hint: solve part (d) first), 26, 32, 36, 46, 51
Theoretical Exercises 7, 9

HW4 (Due on Feb. 11)
Problem 1: Expected value basics using simulation. Modify the R code you used for the last homework, http://www.personal.psu.edu/lxx6/hw3.r, to answer this question. Simulate 10 rolls of a die using R. This can be done by taking a sample of size 10, with replacement, from the set {1, 2, 3, 4, 5, 6}. Do this twice and find the average of the die rolls. The R command to do this is mean(die), where die is the name of the variable that contains all the die rolls. (You will use mean(die) to obtain the mean of the die rolls just as you used sum(die==6)/length(die) to find the proportion of die rolls that were 6 in the last homework.) Repeat this twice for 100 rolls and twice for 10,000 rolls. Report the sample averages in each case. You should have three sets of two numbers each. What do you notice about these sample averages as you increase the number of rolls?

Chapter 3
Problems 60, 64, 66, 71, 78, 84, 89
Theoretical Exercises 5, 6

HW5 (Practice Questions for 1st Midterm; HW5 will not be graded)
Note: Show all work for full credit! Small mistakes in arithmetic will not reduce credit if you show work; conversely, even a correct answer could get no credit without supporting work.

Problem 1: Consider two bags of marbles. The 1st bag has 5 red and 5 white marbles, and the 2nd bag has 2 red and 8 white marbles. A bag is randomly selected and a marble is drawn at random from the selected bag. Please answer the following questions:
a) What is the probability that the marble is red?
b) Given that the marble is red, what is the probability that the 1st bag was the one selected?

Problem 2: Randomly pick 5 cards from a deck of 52 cards without replacement.
Answer the following questions:
a) What is the probability that you get exactly 2 red cards and 3 black cards? (Red means heart or diamond, and black means spade or club.)
b) Suppose that a card is randomly removed from the deck, and then 5 cards are selected from the remaining 51 cards. What are the odds of getting at least 1 red card in your 5-card hand?

Problem 3: Roll a 6-sided die repeatedly. A match occurs if the number i is observed at the ith roll. For example, if we observe the number 5 at the 5th roll, we get a match. The experiment ends when either a match occurs or the die has been rolled 6 times.
a) Find the probability mass function of X = the total number of rolls performed.
b) Calculate P(1 < X < 4).
c) Calculate P(X > 1).

Problem 4: A track star runs two races on a certain day. The probability that she wins the first race is 0.7. The probability that she wins the second race is 0.6. The probability that she wins both races is 0.5. Answer the following questions:
a) Find the probability that she wins at least one race.
b) Find the probability that she wins exactly one race.
c) Find the probability that she wins neither race.

Problem 5: Drawer A contains 4 pennies and 5 dimes. Drawer B contains 6 pennies and 9 dimes. A drawer is selected at random, and a coin is selected at random from that drawer. Answer the following questions:
a) Find the probability of drawing a dime.
b) Suppose a dime is obtained. What is the probability it came from drawer B?

Problem 6: Let P (A) = P (B) = 1/3 and P (A B) = 1/10. Find the following:
a) P (B )
b) P (A B )
c) P (A B )

Problem 7: The discrete random variable X has the probability mass function $f(x) = c p^x$ for $x = 0, 1, \ldots$, and $p \in (0, 1)$. Answer the following questions:
a) What is the value of c?
b) Calculate P(X > 2).
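The die-roll simulations described in HW3 and HW4 can be sketched in R roughly as follows. This is only a minimal sketch and not the contents of the posted hw3.r: the helper name roll_die, the variable names, and the seed are assumptions made for illustration, and the loop uses 10, 100, and 10,000 rolls (HW3 asks for 1,000,000 as its largest size). The idioms sum(die == 6)/length(die) and mean(die) are the ones quoted in the assignments.

```r
# Hypothetical sketch; not the contents of the posted hw3.r.
set.seed(414)  # arbitrary seed so that a run can be reproduced

# Roll a fair six-sided die n_rolls times (sampling with replacement).
roll_die <- function(n_rolls) {
  sample(1:6, size = n_rolls, replace = TRUE)
}

for (n_rolls in c(10, 100, 10000)) {
  die <- roll_die(n_rolls)
  prop_six <- sum(die == 6) / length(die)  # relative frequency of a 6 (HW3)
  avg_roll <- mean(die)                    # sample average of the rolls (HW4)
  cat(n_rolls, "rolls: proportion of 6s =", prop_six,
      ", average =", avg_roll, "\n")
}

# Conditional version (HW3, part 2): keep only the rolls that are 4, 5, or 6,
# then compute the proportion of 6s among the kept rolls.
die <- roll_die(10000)
kept <- die[die >= 4]
cond_prop_six <- sum(kept == 6) / length(kept)
cat("proportion of 6s among rolls in {4, 5, 6} =", cond_prop_six, "\n")
```

Running the script several times without set.seed gives the repeated experiments the assignments ask for.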
HW6 (Due on Feb. 25)
Problem 1: Studying the Binomial distribution using R. Modify this R code, http://www.personal.psu.edu/lxx6/hw5.r, to answer this question. Let $X \sim \mathrm{Binomial}(n = 20, p = 0.2)$.
1. Find P(X = 3) using the R function dbinom. This is the pmf of X evaluated at 3, $f_X(3)$.
2. Find $P(X \le 3)$ using the R function. This is the cdf of X evaluated at 3, $F_X(3)$.
3. Now suppose $Y \sim \mathrm{Binomial}(100, 0.4)$. Using R, find P(Y > 83). Write down the commands you used along with the answer.
4. Simulation:
(a) Simulate 10 realizations of the Binomial random variable X above. Find the proportion of times X is 3. Do this two more times, i.e., take 10 more realizations of the Binomial random variable X and find the proportion of times X is 3. You should report the three proportions. Compare these three with the true probability that X is 3.
(b) Repeat the above exercise but with 100 realizations. Report the three proportions and compare them with the true probability that X is 3.
(c) Repeat the above exercise but with 10,000 realizations. Report the three proportions and compare them with the true probability that X is 3.
(d) What do you notice as the number of realizations increases? Explain how this is related to the notion of the pmf of X. (Explain basic notions of the definition of probability in this context.)

Chapter 4
Problems 39, 40, 41, 44, 48, 49, 55, 57, 61, 71, 78, 79

HW7 (Due on Mar. 3)
Chapter 4
Problems 17, 19, 22, 23, 30, 35, 38, 42, 51
Theoretical Exercises 4, 5, 27

HW8 (Due on Mar. 17)
Chapter 5
Problems 2, 3, 5, 6, 7, 8, 14, 31, 39, 40
Theoretical Exercises 2, 3, 9, 11
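The Binomial computations in HW6 Problem 1 might be set up along the following lines. This is a sketch under assumptions, not the posted hw5.r: only dbinom is named in the assignment, so the use of pbinom for the cdf and rbinom for the simulation (both standard base R functions) is an assumption, as are the variable names and the seed.

```r
# Sketch only; variable names are illustrative and not taken from hw5.r.
n <- 20
p <- 0.2

pmf_at_3 <- dbinom(3, size = n, prob = p)  # P(X = 3), the pmf at 3
cdf_at_3 <- pbinom(3, size = n, prob = p)  # P(X <= 3), the cdf at 3

# Upper tail for Y ~ Binomial(100, 0.4): P(Y > 83) = 1 - P(Y <= 83).
tail_y <- pbinom(83, size = 100, prob = 0.4, lower.tail = FALSE)

# Simulation: proportion of realizations equal to 3 for growing sample sizes.
set.seed(414)  # arbitrary seed so that a run can be reproduced
for (n_sim in c(10, 100, 10000)) {
  x <- rbinom(n_sim, size = n, prob = p)
  cat(n_sim, "realizations: proportion equal to 3 =", mean(x == 3),
      "; pmf at 3 =", pmf_at_3, "\n")
}
```

Here mean(x == 3) is the same relative-frequency idiom as sum(x == 3)/length(x).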
MIT Open Course: Multivariable Calculus
If you have not studied multivariable calculus, please watch the following video lectures during spring break. Multivariable calculus is a very important subject in this class.

Lecture 16: Double Integrals
Video link: http://ocw.mit.edu/courses/mathematics/18-02-multivariable-calculus-fall-2007/video-lectures/lecture-16-double-integrals/
Lecture 17: Polar Coordinates
Video link: http://ocw.mit.edu/courses/mathematics/18-02-multivariable-calculus-fall-2007/video-lectures/lecture-17-polar-coordinates/
Lecture 18: Change of Variables
Video link: http://ocw.mit.edu/courses/mathematics/18-02-multivariable-calculus-fall-2007/video-lectures/lecture-18-change-of-variables/

Moreover, you can find the lecture notes at http://ocw.mit.edu/courses/mathematics/18-02-multivariable-calculus-fall-2007/lecture-notes/

HW9 (Due on Mar. 24)
Chapter 5
Problems 15, 16, 19, 20, 23, 28, 31, 32, 37, 40
Theoretical Exercises 13, 31

HW10 (Due on Apr. 5)
Chapter 6
Problems 2, 7, 8, 10, 20, 21, 27

HW11 (Practice for 2nd Midterm; HW11 will not be graded)
Problem 1. The waiting time between arrivals of aircraft at State College airport is exponentially distributed with a mean of one hour.
(a) What is the probability that more than three aircraft arrive within an hour?
(b) If 30 separate one-hour intervals are chosen, what is the probability that no interval contains more than three arrivals?
(c) Given that no aircraft has arrived in 45 minutes, what is the probability of an arrival in the next 30 minutes?

Problem 2. In a clinical study, volunteers are tested for a gene that has been found to increase the risk for a disease. The probability that a person carries the gene is 0.1.
(i) Let Y be the number of people that have to be tested before one with the gene is detected. State the distribution of Y and its support as well as the values of its parameters.
(ii) What is the probability that four or more people will have to be tested before one with the gene is detected?
(iii) How many people are expected to be tested before one with the gene is detected?

Problem 3. A continuous random variable X has the following probability density function
$$f(x) = \begin{cases} a x e^{-x/b} & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$
where a and b are two unknown positive numbers. We are also given that E(X) = 2. Now answer the following questions:
(a) Find the values of a and b.
(b) Find the expectation of X.
(c) Find the variance of X.
(d) Find P ( X 2 > 1).

Problem 4. The height, in inches, of a randomly chosen American woman is a normal random variable with mean $\mu = 64$ and variance $\sigma^2 = 9$. Now answer the following questions:
(a) Find the probability that the height of a randomly chosen woman is between 59.8 and 68.2 inches.
(b) Given that a randomly chosen woman is tall enough to be an astronaut (i.e., at least 59 inches tall), what is the conditional probability that she is at least 67 inches in height?
(c) What is the height that 95% of American women exceed?
(d) Find $E\left[\left(\frac{X - 64}{3}\right)^3\right]$.
Note: use the normal probabilities table (i.e., Z table).

Problem 5.
(a) Suppose $X \sim \mathrm{Uniform}(a, b)$, with a < b, that is, X is a uniform random variable on the interval [a, b]. Show that $E(X) = \frac{a+b}{2}$. (For full credit please be sure to show all steps.)
(b) Let $Y \sim \mathrm{Uniform}(0, 2)$. Derive the cdf for Y, $F_Y(y)$. For full credit you must show all your work and you should specify the value of $F_Y(y)$ everywhere on $(-\infty, \infty)$, not just on the interval (0, 2).
(c) Let $Y \sim \mathrm{Uniform}(0, 2)$ and $W \sim \mathrm{Uniform}(1, 4)$ and define a random variable Z = 2Y W. Assume Y and W are independent. What is the expected value of Z?
(d) Let V = 2Y 2. What is the variance of V? [2pts]
(e) Suppose that Y and W are not independent. Would the expected value of Z remain the same or could it change?

Problem 6. Let W be a discrete random variable with p.m.f. $f_W(w) = \frac{|w - 4|}{20}$ for $w \in \{0, 1, \ldots, 8\}$.
(a) Find E(W).
(b) Find Var(W).
(c) Find $\mathrm{Var}[(W - 4)^2]$.

Problem 7. Please answer the following questions:
1. I toss a fair coin twice. X is a random variable that equals 1 if the first toss is heads, and 0 otherwise. Y is a random variable that equals 1 if the second toss is heads, and 0 otherwise. Display the joint p.m.f. of (X, Y), $f_{X,Y}(x, y)$, in a table.
2. What are the pmfs of X and Y (marginal pmfs)?
3. What is the probability that XY = 0?
4. What is the correlation between X and Y?

Problem 8. $X_1$, $X_2$ are independent Exp(1) random variables. Let Y 1 = X 1 X 2 and $Y_2 = X_1 + X_2$.
1. Find the joint distribution of $X_1$, $X_2$.
2. Find the joint distribution of $Y_1$, $Y_2$.
3. Find the marginal pdf of $Y_1$.

HW12 (Due on Apr. 14)
Chapter 6
Problems 55, 58
Chapter 7
Problems 4, 33, 38, 40, 45

HW13 (Due on Apr. 26)
Chapter 7
Problems 51, 40 (Hint: use the law of iterated expectation to calculate E(X) and E(XY)), 53, 65
Chapter 8
Problems 2, 7, 13

HW14 (Practice for Final Exam)
1. Consider a biased coin that has a 60% chance of coming up heads and a 40% chance of coming up tails. I toss this biased coin twice. Tosses are independent. X is a random variable that equals 1 if the first toss is heads, and 0 otherwise. Y is a random variable that equals 1 if the second toss is heads, and 0 otherwise. Namely, P(X = 1) = 0.6 and P(X = 0) = 0.4; P(Y = 1) = 0.6 and P(Y = 0) = 0.4. Let Z = X + Y. Please answer the following questions:
(a) Display the joint p.m.f. of (X, Y) (i.e., $f_{X,Y}(x, y)$) in a table.
(b) What is the probability mass function of Z?
(c) Find the value of E[Z].
(d) U is a random variable that equals 0.84 if Z equals 0 and -0.16 otherwise. It is obvious that Z and U are not independent. What is the covariance between Z and U? What is the correlation between Z and U?
(e) Are Z and U independent?

2. The joint PDF of X and Y is $f(x, y) = c x^2 y \, I_{\{0 < x < y,\ 0 < y < 1\}}$.
(a) Find c.
(b) Are X and Y independent? Be sure to explain clearly why or why not.
(c) Find the marginal distribution of Y.
(d) Find $E[X \mid Y = y]$ for any value of y between 0 and 1.
(e) Find E[XY].

3. Let X be a random variable with pdf $f(x) = 4x^3 I_{\{0 < x < 1\}}$.
(a) Find the density for Y = log(X).
(b) Find the density for Z = 1 e x

4. Suppose on a game show, a contestant starts with $10. She gets to choose one of 3 boxes at random; that is, she chooses each box with probability 1/3 and she cannot identify the boxes (they all look the same). If she picks Box A, she has to pay $2, and when she opens the box she wins a prize and stops playing the game. If she picks Box B, she has to pay $3 and then she plays the game again. If she picks Box C, she has to pay $5 and then she plays the game again. Each time she plays the game she is presented with all 3 boxes again and picks one of the 3 at random (she cannot identify the boxes). If she keeps playing until she wins a prize, what is the expected amount of money she has at the end? Note that it is possible for her to end up with a negative amount of money; that is, if she has no money left from the $10, she has to pay out of her own pocket and will continue to play. So if she spends $25 until she wins a prize, the money she has left is $10 - $25 = -$15. (Hint: Use conditioning/the law of iterated expectations.)
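One way to sanity-check an answer to the game-show problem obtained by conditioning is a small Monte Carlo simulation. The sketch below follows the rules as stated in the problem; the function name play_game, the number of simulated games, and the seed are assumptions made for illustration, and the simulation is a numerical check, not the conditioning argument the hint asks for.

```r
# Hypothetical Monte Carlo sketch of the game described in Problem 4.
set.seed(414)  # arbitrary seed so that a run can be reproduced

# Play one full game and return the money left when the prize is won.
play_game <- function(start_money = 10) {
  money <- start_money
  repeat {
    box <- sample(c("A", "B", "C"), size = 1)  # each box has probability 1/3
    if (box == "A") {
      return(money - 2)    # pay $2, win the prize, stop playing
    } else if (box == "B") {
      money <- money - 3   # pay $3, then play again
    } else {
      money <- money - 5   # pay $5, then play again
    }
  }
}

n_games <- 20000
final_money <- replicate(n_games, play_game())
cat("simulated average final money =", mean(final_money), "\n")
```

The simulated average should be close to the value found analytically; increasing n_games tightens the agreement.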
5. Let X be a random variable with probability density function $f(x) = x e^{-x}$ if $0 < x < \infty$, and 0 otherwise.
(a) Find P ( X + 4 6)
(b) Find the moment generating function of X.
(c) Find the expectation and the variance of X.

6. An urn contains b black balls and r red balls. One of the balls is drawn at random, but when it is put back in the urn, c additional balls of the same color are put in with it. Now suppose that we draw another ball.
(a) Find the probability that the first ball was black AND the second ball is red.
(b) Find the probability that the first ball was black GIVEN that the second ball drawn is red.

7. A type C battery is in working condition with probability 0.6, and a type D battery is in working condition with probability 0.5. A battery is randomly chosen from a bin containing 6 type C batteries and 8 type D batteries. Let $A_1$ = {the chosen battery is type C}, $A_2$ = {the chosen battery is type D} and B = {the chosen battery works}. Please answer the following questions:
(a) Find $P(A_1)$ and $P(A_2)$.
(b) Find the probability that the chosen battery works (i.e., P(B)).
(c) Given that the battery does not work, find the probability that it was a type D battery (i.e., $P(A_2 \mid B^c)$).

8. Let X be the yearly claim (in dollars) of a randomly chosen policyholder of MAICO, an auto insurance company. It is known that E(X) = 750 and Var(X) = 62,500.
(a) Show that $P(250 \le X \le 1{,}250) > \frac{3}{4}$. [Hint: Chebyshev's Inequality]
(b) State the Central Limit Theorem for a sequence of independent, identically distributed random variables $X_1, X_2, \ldots$, each having mean $\mu$ and variance $\sigma^2$.
(c) Let $\bar{X}$ be the average claim of 10,000 randomly chosen MAICO policyholders. Show that $P(745 \le \bar{X} \le 755) \approx 0.9544$. [Note: You may ignore the continuity correction.]
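The arithmetic behind Problem 8 can be checked numerically. The sketch below plugs the stated mean and variance into the Chebyshev bound and into the normal approximation for the sample mean; the variable names are assumptions, and pnorm is used in place of the Z table, so the last digits may differ slightly from a table lookup.

```r
# Numerical check of the quantities in Problem 8; names are illustrative.
mu <- 750              # E(X), as given in the problem
sigma <- sqrt(62500)   # SD(X) = 250, from Var(X) = 62,500

# Chebyshev: P(|X - mu| < k * sigma) >= 1 - 1 / k^2.
k <- (1250 - mu) / sigma          # number of SDs from mu to either endpoint
chebyshev_lower <- 1 - 1 / k^2    # lower bound on P(250 < X < 1250)

# CLT: the average of n iid claims is approximately Normal(mu, sigma^2 / n).
n <- 10000
se <- sigma / sqrt(n)             # standard deviation of the sample mean
z <- (755 - mu) / se              # standardized upper endpoint
clt_prob <- pnorm(z) - pnorm(-z)  # approximate P(745 <= mean <= 755)

cat("k =", k, ", Chebyshev lower bound =", chebyshev_lower, "\n")
cat("z =", z, ", normal approximation =", clt_prob, "\n")
```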
Study Guide for Final Exam

Chapters 1-3
Combinatorial analysis: basic principle of counting, permutation, and combination
Axioms of probability
Inclusion-exclusion identity: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
Definition and properties of conditional probabilities
Independence
Law of total probability
Bayes's theorem

Chapter 4 Discrete Random Variables
Definition and properties of p.m.f. f(x), c.d.f. F(x), E(X), Var(X), and SD(X)
Linearity for expectations: $E\left(\sum_i X_i\right) = \sum_i E(X_i)$ (CAUTION: generally linearity does not hold for variances)
Definition and properties of Bernoulli, Discrete Uniform, Binomial, Geometric and Poisson distributions, including their p.m.f., c.d.f., E(X), Var(X) and m.g.f.
Poisson approximation to Binomial distributions

Chapter 5 Continuous Random Variables
Definition and properties of p.d.f. f(x), c.d.f. F(x), E(X), Var(X), and SD(X)
Differentiating F(x): $f(x) = \frac{d}{dx} F(x)$
Definition and properties of Continuous Uniform, Exponential, Normal (Gaussian), Gamma and Cauchy distributions, including their p.d.f., c.d.f., E(X), Var(X) and m.g.f.
Properties of Normal random variables: graphs of f(x), linear transformation Y = aX + b, standard normal random variable N(0, 1), Z transformation $Z = \frac{X - \mu}{\sigma}$, and Z table
Memoryless properties: Exponential distribution and Geometric distribution
Normal approximation to Binomial distributions
Find the p.d.f. of Y = g(X): change of variable techniques
If U is uniformly distributed on [0, 1], $X = F^{-1}(U)$ has the cumulative distribution function F(x); on the other hand, if X has the cumulative distribution function F(x), U = F(X) is uniformly distributed on [0, 1]
Definition and properties of Poisson process (discrete value, continuous time): counting process $N(t+s) - N(s) \sim \mathrm{Poi}(\lambda t)$ for any $s \ge 0$, waiting time for next event $T \sim \mathrm{Exp}(\lambda)$, waiting time for next k-th event $T_k \sim \mathrm{Gamma}(k, \lambda)$, independent increments

Chapter 6 Jointly Distributed Random Variables
Definition and properties of joint/marginal p.m.f. or p.d.f. f(x, y), $f_X(x)$ and $f_Y(y)$
Derive the expectations E(g(X, Y)) and the variances Var(g(X, Y))
Independence of random variables
Sums of independent Bernoulli/Binomial/Normal/Exponential/Poisson random variables
Find the joint p.d.f. of $(Y_1, Y_2)$ where $Y_1 = g_1(X_1, X_2)$ and $Y_2 = g_2(X_1, X_2)$
Bivariate Normal distribution: joint/marginal p.d.f., correlation and independence

Chapter 7 Properties of Expectations
Covariance and correlation
$\mathrm{Var}(aX + bY) = a^2 \mathrm{Var}(X) + b^2 \mathrm{Var}(Y) + 2ab\,\mathrm{Cov}(X, Y)$
Define $\bar{X}_n = \frac{1}{n}(X_1 + \cdots + X_n)$, where the $X_i$'s are independent with the same mean $\mu$ and variance $\sigma^2$. Then $E(\bar{X}_n) = \mu$ and $\mathrm{Var}(\bar{X}_n) = \frac{\sigma^2}{n}$
Independence and correlation
Conditional distribution and conditional expectation
Law of iterated expectations (LIE): $E(g(X)) = E(E(g(X) \mid Y))$ and $\mathrm{Var}(g(X)) = \mathrm{Var}(E(g(X) \mid Y)) + E(\mathrm{Var}(g(X) \mid Y))$
Let $Y = \sum_{i=1}^{N} X_i$ where $X_1, X_2, \ldots$ have the same mean $\mu$ and variance $\sigma^2$. Then $E(Y) = \mu E(N)$ and $\mathrm{Var}(Y) = \sigma^2 E(N) + \mu^2 \mathrm{Var}(N)$.
Moment generating function
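The random-sum formulas in the Chapter 7 list, E(Y) = µE(N) and Var(Y) = σ²E(N) + µ²Var(N), can be checked by simulation. The sketch below is illustrative only and rests on assumptions not spelled out in the study guide: the X_i are taken to be i.i.d. and independent of N, N is taken to be Poisson(5), and the X_i are taken to be Exponential with mean 2. All names and parameter values are hypothetical.

```r
# Illustrative check of the random-sum formulas; all parameter choices are
# assumptions (X_i iid and independent of N, N ~ Poisson, X_i ~ Exponential).
set.seed(414)   # arbitrary seed so that a run can be reproduced
lambda_n <- 5   # N ~ Poisson(5), so E(N) = Var(N) = 5
mu <- 2         # X_i ~ Exponential with rate 1/2: mean 2
sigma2 <- 4     # ... and variance 4

# Draw N, then sum N independent X_i (an empty sum is 0 when N = 0).
random_sum <- function() {
  n <- rpois(1, lambda_n)
  sum(rexp(n, rate = 1 / mu))
}
y <- replicate(50000, random_sum())

theory_mean <- mu * lambda_n                        # mu * E(N)
theory_var <- sigma2 * lambda_n + mu^2 * lambda_n   # sigma^2 E(N) + mu^2 Var(N)

cat("mean: simulated", mean(y), "vs formula", theory_mean, "\n")
cat("variance: simulated", var(y), "vs formula", theory_var, "\n")
```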
Chapter 8 Limit Theorems
Markov/Chebyshev inequality
Central Limit Theorems (CLT)
Application of Central Limit Theorems
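As an illustration of the Central Limit Theorem listed above, the following sketch standardizes sample means of Exp(1) random variables and compares the empirical probability of falling within two standard deviations to the standard normal value. The choice of the Exp(1) distribution, the sample size of 50, the number of replications, and the seed are all assumptions made for this example.

```r
# Illustrative CLT simulation; distribution and sizes are assumptions.
set.seed(414)    # arbitrary seed so that a run can be reproduced
n <- 50          # number of observations in each sample mean
n_rep <- 20000   # number of simulated sample means

# X_i ~ Exp(1) has mean 1 and variance 1, so the standardized sample mean
# is (mean(x) - 1) / (1 / sqrt(n)).
z <- replicate(n_rep, {
  x <- rexp(n, rate = 1)
  (mean(x) - 1) / (1 / sqrt(n))
})

empirical <- mean(abs(z) <= 2)       # simulated P(|Z_n| <= 2)
normal_val <- pnorm(2) - pnorm(-2)   # standard normal P(|Z| <= 2)

cat("empirical:", empirical, " normal:", normal_val, "\n")
```

Increasing n should bring the two values closer together, which is the sense in which the normal approximation improves with sample size.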