Quiz 1. Name:
Instructions: Closed book, notes, and no electronic devices.
1. What is the difference between a deterministic model and a probabilistic model? (Two or three sentences only.)
2. What is the difference between DATA and data? (Two or three sentences only.)

Quiz 2. Name:
Instructions: Closed book, notes, and no electronic devices.
1. Suppose the derivative of a curved function f(x) at x = 3.5 is given by $f'(3.5) = 0.7$. Draw a graph that illustrates this fact.
2. Suppose the integral of a positive curved function over the range from 1 to 2 is 3.0; i.e., suppose that $\int_1^2 f(x)\,dx = 3.0$. Draw a graph that illustrates this fact.
3. Suppose the derivative of a curved function f(x) at x = 3.5 is given by $f'(3.5) = 0.7$. Explain how this fact appears in a graph of the function f(x).
4. Suppose the integral of a positive curved function over the range from 1 to 2 is 3.0; i.e., suppose that $\int_1^2 f(x)\,dx = 3.0$. Explain how this fact appears in a graph of the function f(x).

Quiz 3. Name:
Instructions: Closed book, notes, and no electronic devices.
You are employed by a company to evaluate the credit-worthiness of loan applicants. The next person who walks into your office will have annual income Y. In the absence of any other information about that person, give your model for Y.

Quiz 4: Name:
Closed books, notes, and no electronic devices.
Here is the normal q-q plot of the call center data from the book:
[Figure: normal q-q plot of the call center data]
The data points do not lie on a straight line. As described in the book, how can you tell if the differences between the points and the line are explainable by chance alone? (1-3 sentences maximum.)

Quiz 5: Name:
Closed books, notes, and no electronic devices.
There is a probability that a car purchaser will select a red car. What does the phrase "nature favors continuity over discontinuity" tell you about how this probability relates to the age of the purchaser?

Quiz 6 Name:
Closed book, notes, and no electronic devices.
1. Which mantra applies when deciding whether to use p(y | x) versus p(x | y)?
   A. "Model produces data"
   B. "Nature favors continuity over discontinuity"
   C. "Use what you know to predict what you don't know"
2. Suppose p(x,y) is a continuous joint distribution. Then $\iint p(x,y)\,dx\,dy =$
   A. The mean
   B. The variance
   C. The covariance
   D. 1.0
3. Which formula gives you the conditional distribution p(y | x)?
   A. $\iint p(y \mid x)\,dx\,dy$
   B. $\int p(x,y)\,dx$
   C. $p(x,y)/p(x)$
4. Suppose X ~ p(x) and Y ~ p(y), independent of X. Then the joint distribution is given by p(x,y) = ____.
   A. $p(x)\,p(y)$
   B. $\iint p(x,y)\,dx\,dy$
   C. $p(x,y)/p(x)$

Quiz 7 Name:
Closed book, notes, and no electronic devices.
1. You are on a drunk driving excursion. What is the probability that you will either kill someone or be killed on this excursion? Give your best guess.
2. 60% of car purchasers at a dealership are younger. Among the younger purchasers at this dealership, 50% buy a red car. Out of the next 100 customers, about how many will be younger customers who purchase a red car?
3. In the housing expense example, what distribution was assumed for income?
   A. Uniform
   B. Normal
   C. Poisson
   D. Bernoulli
4. In the psychometric evaluation example, what distribution was assumed for stealing behavior?
   A. Uniform
   B. Normal
   C. Poisson
   D. Bernoulli

Quiz 8 Name:
Closed book, notes, and no electronic devices.
You will randomly sample a single person from a finite population of N = 1,000 people, and ask that person, "Do you like lemonade?" The answer will be either Y = "yes" or Y = "no." Of the 1,000 people, suppose that 280 would answer "yes" and 720 would answer "no." Write down the population p(y) in list form.

Quiz 9. Name:
Closed book, notes, and no electronic devices.
It is common to assume that data $Y_1, Y_2$ are independent and identically distributed (iid). Briefly give a real example of data $Y_1, Y_2$ that are not identically distributed:
$Y_1$ = ____
$Y_2$ = ____

Quiz 10. Name:
Closed book, notes, and no electronic devices.
The graph of a continuous probability distribution function (pdf) p(y) is shown below. Guess the value of E(Y):
[Figure: graph of p(y); vertical axis p(y) from 0.0 to 0.8, horizontal axis y from 0 to 12]

Quiz 11. Name:
Closed book, notes, and no electronic devices.
Here is a distribution that produces Y.
   y      p(y)
   1      1/2
   2      1/4
   3      1/4
   Total  1.0
Find E(1 + 4Y) using the linearity property of expectation.
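For checking an answer to Quiz 11 after attempting it by hand, the following is a minimal sketch, not part of the original quiz. It computes E(1 + 4Y) twice: directly from the definition of expectation, and through the linearity property. The helper name `expectation` is an illustrative assumption.

```python
from fractions import Fraction

# Distribution in list form, copied from the quiz table: value -> probability.
p = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}

def expectation(pmf, g=lambda y: y):
    """Return E[g(Y)] for a discrete distribution given as {y: p(y)}.

    Hypothetical helper introduced for this sketch only.
    """
    return sum(g(y) * prob for y, prob in pmf.items())

mean_y = expectation(p)                        # E(Y)
direct = expectation(p, lambda y: 1 + 4 * y)   # E(1 + 4Y) from the definition
via_linearity = 1 + 4 * mean_y                 # 1 + 4 E(Y), by linearity

print(mean_y, direct, via_linearity)
```

Both routes give the same number, which is the point of the linearity property: only E(Y) is needed.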

Quiz 12. Name:
Closed book, notes, and no electronic devices.
1. (30) Show using calculus that $f(y) = y^2$ is a convex function.
2. (30) Here is a distribution that produces Y.
   y      p(y)
   1      1/4
   2      2/4
   3      1/4
   Total  1.0
   It is clear that E(Y) = 2; don't show this. It is also easy to show that $E(Y^2) = 18/4$ (or 4.5); don't show this either. Instead, use your answer to 1. to answer this: How do the facts that E(Y) = 2 and $E(Y^2) = 4.5$ illustrate Jensen's inequality?
3. (20) The mean of students' commute times is 10 minutes and the standard deviation is 4 minutes. What percentage of commute times are between 2 and 18 minutes?
   A. at least 75%
   B. at least 95%
   C. exactly 75%
   D. exactly 95%
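The two facts that Quiz 12 asks you to take as given, E(Y) = 2 and E(Y^2) = 4.5, can be verified numerically. The sketch below is an illustration added here rather than part of the quiz; the variable names are assumptions. It also compares E(f(Y)) with f(E(Y)) for f(y) = y^2.

```python
from fractions import Fraction

# Distribution from question 2 of the quiz: value -> probability.
p = {1: Fraction(1, 4), 2: Fraction(2, 4), 3: Fraction(1, 4)}

def f(y):
    """The function from question 1, f(y) = y^2."""
    return y ** 2

e_y = sum(y * prob for y, prob in p.items())       # E(Y)
e_f_y = sum(f(y) * prob for y, prob in p.items())  # E(f(Y)) = E(Y^2)
f_e_y = f(e_y)                                     # f(E(Y)) = (E(Y))^2

# For this f, Jensen's inequality compares these two quantities.
jensen_holds = e_f_y >= f_e_y
print(e_y, e_f_y, f_e_y, jensen_holds)
```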

Quiz 13. Name:
Closed book, notes, and no electronic devices.
All mathematical theorems are logical statements of the form "If A is true, then B is true." For example, if A is "that animal is a cow," and B is "that animal is a mammal," we know that if A is true, then B is true. The Central Limit Theorem is similar. There is an assumption (the "A"), and a conclusion (the "B"). If the assumption (A) is true, then the conclusion (B) is true. Give the assumption (the A) and the conclusion (the B) for the Central Limit Theorem.
Assumption: If ____
Conclusion: Then ____

Quiz 14. Name:
Instructions: Closed book, notes and no electronic devices. Turn quiz over when done.
Suppose $Y_1, Y_2 \sim_{iid} N(\mu, \sigma^2)$. Show, step by step, with justifications, that $(Y_1 + Y_2)/2$ is an unbiased estimator of $\mu$.
Suppose $Y_1, Y_2 \sim_{iid} \text{Bernoulli}(\pi)$. Show, step by step, with justifications, that $(Y_1 + Y_2)/2$ is an unbiased estimator of $\pi$.

Quiz 15. Name:
Closed book, notes, and no electronic devices.
Note: Problem 4 is a different kind of multiple choice problem.
1. (20) Which distribution is useful as a model for a process that produces occasional outliers?
   A. Mixture
   B. Normal
   C. Uniform
   D. Beta
2. (20) Suppose Y ~ p(y), where p(y) is the Poisson pdf. Then
   A. E(Y) = 1
   B. E(Y) = Var(Y)
   C. $E(Y) = y_{0.5}$
   D. E(Y) =
3. (20) The variance of an estimator $\hat\theta$ is 10 and its bias is 2. Then $\mathrm{ESD}(\hat\theta) = E\{(\hat\theta - \theta)^2\} =$
   A. 12
   B. 14
   C. $\mathrm{Var}(\hat\theta)$
   D. 0.0
4. (Select all that apply; 4 points for each correct selection/non-selection.) Suppose $Y_1, Y_2, \ldots, Y_n \sim_{iid} p(y)$, where the mean of the distribution p(y) is $\mu$. Let $\bar{Y} = (Y_1 + Y_2 + \cdots + Y_n)/n$. Then $\bar{Y}$ is ____.
   A. an unbiased estimator of $\mu$
   B. an efficient estimator of $\mu$
   C. a consistent estimator of $\mu$
   D. equal to $\mu$ when n is large
   E. a normally distributed estimator of $\mu$ when n is large

Quiz 16. Name:
Closed book, notes, and no electronic devices.
1. (20) Data Y are produced by the $N(\mu, \sigma^2)$ distribution. What is the parameter space for $\sigma$?
   A. The set {0, 1}.
   B. The set {-3, -2, -1, 0, 1, 2, 3}.
   C. The set of all numbers between 0 and 1.
   D. The set of all numbers greater than 0.
2. (60) Suppose the data $Y_i$ are produced as iid from the Bernoulli($\pi$) distribution. The data values are $y_1 = 1$ and $y_2 = 1$. Write down $L(\pi)$, the likelihood function for $\pi$.

Quiz 17. Name:
Instructions: Closed book, notes and no electronic devices. Turn quiz over when done.
1. (40) Here is a graph of a likelihood function for a parameter $\theta$ ("theta"). The MLE is $\hat\theta = 0.6$. Give your best guess (a single number) of the Wald standard error of $\hat\theta$ as indicated by the graph.
   [Figure: likelihood function; vertical axis L(theta) from 0 to 5, horizontal axis theta from 0.0 to 1.0]
2. (20) The log-likelihood function (called LL in the book) for an iid sample is equal to
   A. the pdf for the sample.
   B. the sum of the pdfs for each observation in the sample.
   C. the sum of the logarithms of the pdfs for each observation in the sample.
   D. the product of the pdfs for each observation in the sample.
3. (20) Suppose $y_1, y_2, \ldots, y_n$ are produced as an iid sample from $N(\mu, \sigma^2)$, where $\mu$ and $\sigma$ are unknown parameters. Then the MLE of $\sigma$ is
   A. $\frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2$
   B. $\frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2$
   C. $\sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{y})^2}$
   D. $\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2}$

Quiz 18. Name:
Instructions: Closed book, notes and no electronic devices. Turn quiz over when done.
1. Draw a graph of a prior distribution that expresses your prior ignorance about the Bernoulli parameter. Label and number both axes.
2. Draw a graph of a posterior distribution about the Bernoulli parameter. Label and number both axes.

Quiz 18. Name:
Closed books, notes and no electronic devices.
1. Hans got two successes and 8 failures in both his thumbtack toss experiment and in his coin toss experiment. His likelihood function for the thumbtack toss data was ____ his likelihood function for the coin toss data.
   A. identical to
   B. shifted to the left of
   C. shifted to the right of
2. The ____ is what you use to express your uncertainty about the parameters before collecting your data.
   A. prior distribution
   B. posterior distribution
   C. likelihood function
3. Hans gives his prior distribution for $\mu$ = mean driving time as follows:
   $\mu$       $p(\mu)$
   20.0 min  0.4
   20.5 min  0.6
   Total     1.0
   Hans' prior is an example of a ____ prior.
   A. non-informative
   B. vague
   C. uniform
   D. dogmatic
4. The function [formula missing] is the kernel of the beta(____) distribution.
   A. 1, 2
   B. 0, 1
   C. 1/3, 2/3
   D. 2, 3

Quiz 19. Name:
Closed book, notes, and no electronic devices.
1. What is Markov Chain Monte Carlo?
   A. A gambling strategy
   B. A financial investment strategy
   C. A method for simulating data from $p(\text{data} \mid \theta)$
   D. A method for simulating $\theta$ from $p(\theta \mid \text{data})$
2. Why is the prior $p(\theta) = 1$, for $-\infty < \theta < \infty$, called "improper"?
   A. Because the area under the curve is infinity
   B. Because distributions cannot be negative numbers
   C. Because it is dogmatic
   D. Because it cannot be used in practice
3. When do you need to use Bayesian statistics?
   A. When the normality assumption is violated
   B. When the data are not produced by a model
   C. When you wish to select plausible values of the parameters
   D. When you have no idea what your prior distribution is
4. Past performance is no guarantee of future performance. So the parameters that govern future financial markets are unknown. How does the book suggest that you select these parameters?
   A. By calculating them from future data
   B. By simulating them from their posterior distribution, given past data
   C. By asking experts (such as Jimmie Buffet) to give them to you
   D. By using logistic regression on historical financial data

Quiz 19. Name:
Instructions: Closed book, notes and no electronic devices. Turn quiz over when done.
1. In the Bayesian logistic regression example, it was found that
   A. Success probability increases with greater experience
   B. Success probability decreases with greater experience
   C. Success probability sometimes increases, sometimes decreases with greater experience
   D. Success probability is independent of experience
2. In the stock return example, what was the approximate probability that the mean return, $\mu$, was more than 0?
   A. 0.0
   B. 0.30
   C. 0.70
   D. 1.0
3. Why is the prior $p(\theta) = 1$, for $-\infty < \theta < \infty$, called "improper"?
   E. Because it cannot be used in practice
   F. Because the area under the curve is infinity
   G. Because distributions cannot be negative numbers
   H. Because it is dogmatic
4. When do you need to use Bayesian statistics?
   E. When you wish to select plausible values of the parameters
   F. When the normality assumption is violated
   G. When the data are not produced by a model
   H. When you have no idea what your prior distribution is

Quiz 20. Name:
Closed book, notes and no electronic devices. Turn quiz over when done.
1. When you say that a frequentist confidence interval is an "approximate 95% interval," what does the word "approximate" mean?
   A. The distribution p(y) in the model $Y_1, Y_2, \ldots, Y_n \sim_{iid} p(y)$ is not exactly normal; it is just approximately a normal distribution.
   B. The true confidence level is not exactly 95%; it is instead just approximately 95%.
   C. The endpoints of the interval are not exactly correct; they are only approximately correct.
2. Suppose $Y_1, Y_2, \ldots, Y_{10}$ are iid coin toss outcomes, either 0 or 1. Thus each $Y_i$ has expected value $\mu = 0.5$ and variance $\sigma^2 = 0.25$. The variance of $(Y_1 + Y_2 + \cdots + Y_{10})/10$ is
   A. 0.0005
   B. 0.025
   C. 0.25
   D. 0.5
3. In the town/mountain lion analogy, the mountain lion represents ____ and the town represents ____.
   A. the estimate, the estimand
   B. the estimand, the estimate
   C. confidence, probability
   D. probability, confidence
4. When are frequentist and Bayesian conclusions similar?
   A. when the normal distribution is assumed
   B. when the data are free of outliers
   C. when the prior is vague
   D. when the data are iid
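The variance asked for in question 2 of Quiz 20 can be checked exactly, without simulation, by enumerating all equally likely sequences of ten fair coin tosses. This is a hedged sketch added for illustration, assuming a fair coin as stated in the quiz; the variable names are not from the source.

```python
from fractions import Fraction
from itertools import product

n = 10
half = Fraction(1, 2)  # each toss is 0 or 1 with probability 1/2

# Enumerate all 2**n equally likely toss sequences and their sample averages.
means = [Fraction(sum(tosses), n) for tosses in product((0, 1), repeat=n)]
prob = half ** n  # probability of each individual sequence

mean_of_avg = sum(m * prob for m in means)
var_of_avg = sum((m - mean_of_avg) ** 2 * prob for m in means)

# Compare with the formula Var(average) = sigma^2 / n, where sigma^2 = 1/4.
formula = Fraction(1, 4) / n
print(mean_of_avg, var_of_avg, formula)
```

Exact fractions are used so that the comparison with the formula is not blurred by floating-point rounding.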

Quiz 21. Name:
Closed book, notes and no electronic devices. Turn quiz over when done.
1. The permutation model is an example of
   A. an iid model
   B. a normally distributed model
   C. a null model
   D. a Bernoulli model
2. Suppose $Y_1, Y_2, \ldots, Y_{16} \sim_{iid} p_0(y)$, where the mean of any $Y_i$ is $\mu$, and the variance of any $Y_i$ is $\sigma^2$. Let $\bar{Y}_1$ be the average of the first 8 observations, and let $\bar{Y}_2$ be the average of the last 8 observations. Then $E(\bar{Y}_2 - \bar{Y}_1) =$
   A. 0
   B. $\bar{y}_2 - \bar{y}_1$
   C. $\mu$
   D. 1.375
3. Suppose $Y_1, Y_2, \ldots, Y_{16} \sim_{iid} p_0(y)$, where the mean of any $Y_i$ is $\mu$, and the variance of any $Y_i$ is $\sigma^2$. Let $\bar{Y}_1$ be the average of the first 8 observations, and let $\bar{Y}_2$ be the average of the last 8 observations. Then $\mathrm{Var}(\bar{Y}_2 - \bar{Y}_1) =$
   A. 0
   B. $\sigma^2/8$
   C. $\sigma^2/8 + \sigma^2/8 = \sigma^2/4$
   D. $\sigma^2$
4. What does a p-value of 0.032 mean?
   A. There is a 0.032 probability that the data are explained by chance alone
   B. There is a 0.032 probability of seeing a difference as extreme or more extreme than the observed difference, assuming the data are produced by a null model
   C. There is a 0.032 probability that the data are not explained by chance alone
   D. There is a 0.032 probability of seeing a difference that is less extreme than the observed difference, assuming the data are produced by a null model
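Questions 2 and 3 of Quiz 21 both rest on standard properties of expectation and variance. The derivation below is a sketch added for reference, not taken from the quiz. It assumes only what the questions state: the 16 observations are iid with mean $\mu$ and variance $\sigma^2$, and the two averages use disjoint sets of 8 observations, so the averages are independent.

```latex
\begin{align*}
E(\bar{Y}_2 - \bar{Y}_1)
  &= E(\bar{Y}_2) - E(\bar{Y}_1) = \mu - \mu = 0
  && \text{(linearity of expectation)} \\
\mathrm{Var}(\bar{Y}_2 - \bar{Y}_1)
  &= \mathrm{Var}(\bar{Y}_2) + \mathrm{Var}(\bar{Y}_1)
  && \text{(independence of the two averages)} \\
  &= \frac{\sigma^2}{8} + \frac{\sigma^2}{8} = \frac{\sigma^2}{4}
  && \text{(variance of an average of 8 iid values)}
\end{align*}
```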

Quiz 22. Name:
Closed book, notes and no electronic devices. Turn quiz over when done.
1. Suppose a random variable V is produced by the chi-square distribution with 10 degrees of freedom. Then E(V) =
   A. 0
   B. 1
   C. 9
   D. 10
2. Suppose $Y_1, Y_2, \ldots, Y_n \sim_{iid} N(\mu, \sigma^2)$. What is the distribution of $\sum_{i=1}^{n} (Y_i - \bar{Y})^2 / \sigma^2$?
   A. $N(0,1)$
   B. $N(\mu, \sigma^2)$
   C. $\chi^2_{n-1}$
   D. $\chi^2_{n}$
3. Why was the confidence interval for $\sigma$ so wide in the failure time example?
   A. Because the normality assumption was violated
   B. Because the chi-squared assumption was violated
   C. Because the sample size was very small
   D. Because the sample size was very large
4. When is the distribution of the statistic $\dfrac{\bar{Y} - \mu}{\hat\sigma/\sqrt{n}}$ exactly a t-distribution?
   A. When $Y_1, Y_2, \ldots, Y_n \sim_{iid} N(\mu, \sigma^2)$
   B. When $Y_1, Y_2, \ldots, Y_n \sim_{iid} T_{df}$
   C. When the sample size, n, is sufficiently large
   D. Never

Quiz 22. Name:
Instructions: Closed book, notes and no electronic devices. Turn quiz over when done.
1. (40) Suppose $Y \sim N(10, 5^2)$. Let T = 2Y. What is the distribution of T? Answer in words, no symbols.
2. (40) Suppose $Y_1, Y_2 \sim_{iid} N(0,1)$. Let $T = Y_1^2 + Y_2^2$. What is the distribution of T? Answer in words, no symbols.

Quiz 23. Name:
Instructions: Closed book, notes and no electronic devices. Turn quiz over when done.
The two-sample t test is used to compare means of two distinct groups. The data are $Y_{ij}$, where i denotes group (i = 1, 2), and j indicates observation within group ($j = 1, 2, \ldots, n_i$). State the null model that you assume to have produced your data $Y_{ij}$ when you use the two-sample t test.

Quiz 24. Name:
Instructions: Closed book, notes and no electronic devices. Turn quiz over when done.
Here is a partial data set to be used in a standard regression analysis (as discussed in the reading) to predict Y = price of a used car of a particular make and model, from X = age (in years) of the car.
   Obs  X=age  Y=price
   1    10     $Y_1$
   2    5      $Y_2$
   3    12     $Y_3$
   4    3      $Y_4$
State the null model for how $Y_1, Y_2, Y_3, Y_4$ are produced.

Quiz 25. Name:
Instructions: Closed book, notes and no electronic devices. Turn quiz over when done.
The George HW Bush ratings (X) and the Barbara Bush ratings (Y) are as follows.
   Obs  X=GeorgeHWBush  Y=BarbBush
   1    1               2
   2    4               3
   3    4               3
   4    1               1
   5    4               4
   6    1               1
   7    3               1
   8    4               4
These data were used in a chi-square test in the reading. This test assumes a particular null model. Explain in words, without using symbols, the null model for how the BarbBush data (the Y data) were produced. Specifically, how were the numbers 2,3,4,1,4,1,1,4, produced, according to the null model?

Quiz 26. Name:
Instructions: Closed book, notes and no electronic devices. Turn quiz over when done.
In the chapter, data were used to test conformance with a standard. The standard was that the computer chip width data values look as if produced independently by the normal distribution having mean 310 and standard deviation 4.5. How will you assume the computer chip width data values are produced when you perform power analysis? Answer in words; do not use symbols.

Quiz 27. Name:
Closed books, notes, and no electronic devices.
Here is a data set.
   Obs  y
   1    10.0
   2    0.0
   3    3.4
   4    327.3
   ...
   213  3.4
How do you produce a bootstrap sample $y_1^*, y_2^*, \ldots, y_{213}^*$ from this data set?
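As an illustration of the idea behind Quiz 27, here is a minimal sketch of one common way to draw a bootstrap sample: sample from the observed values with replacement, keeping the sample size the same as the original. The function name `bootstrap_sample` is hypothetical, and the list `y` holds only the four leading rows visible in the quiz table; the remaining rows are elided in the source and are not reconstructed here.

```python
import random

def bootstrap_sample(data, rng=None):
    """Draw len(data) values from data, with replacement.

    Hypothetical helper for this sketch; rng is an optional random.Random
    instance so that results can be reproduced.
    """
    rng = rng or random.Random()
    return rng.choices(data, k=len(data))

# Only the rows shown in the quiz (Obs 1-4); the full data set has 213 rows.
y = [10.0, 0.0, 3.4, 327.3]

y_star = bootstrap_sample(y, random.Random(0))
print(y_star)
```

With the full data set, the same call would return 213 values, each one drawn from the 213 observed values, so some observations typically appear more than once and others not at all.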