First Midterm Examination Econ 103, Statistics for Economists February 14th, 2017

Similar documents
4. Suppose that we roll two die and let X be equal to the maximum of the two rolls. Find P (X {1, 3, 5}) and draw the PMF for X.

SSEA Math 51 Track Final Exam August 30, Problem Total Points Score

Review Questions. Econ 103 Spring 2018

STAT 414: Introduction to Probability Theory

MAT 271E Probability and Statistics

Cornell University Final Examination December 11, Mathematics 105 Finite Mathematics for the Life and Social Sciences

Exam III #1 Solutions

ECE 302, Final 3:20-5:20pm Mon. May 1, WTHR 160 or WTHR 172.

Cornell University Final Examination December 11, Mathematics 105 Finite Mathematics for the Life and Social Sciences

Midterm 2 - Solutions

Econ 371 Problem Set #1 Answer Sheet

Problem # Number of points 1 /20 2 /20 3 /20 4 /20 5 /20 6 /20 7 /20 8 /20 Total /150

review session gov 2000 gov 2000 () review session 1 / 38

STAT 418: Probability and Stochastic Processes

Midterm Exam 1 Solution

STAT 31 Practice Midterm 2 Fall, 2005

18.05 Practice Final Exam

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

To find the median, find the 40 th quartile and the 70 th quartile (which are easily found at y=1 and y=2, respectively). Then we interpolate:

18.05 Final Exam. Good luck! Name. No calculators. Number of problems 16 concept questions, 16 problems, 21 pages

Lynch 2017 Page 1 of 5. Math 150, Fall 2017 Exam 1 Form A Multiple Choice

Ecn Analysis of Economic Data University of California - Davis February 23, 2010 Instructor: John Parman. Midterm 2. Name: ID Number: Section:

EECS 126 Probability and Random Processes University of California, Berkeley: Spring 2015 Abhay Parekh February 17, 2015.

PRACTICE PROBLEMS FOR EXAM 2

Swarthmore Honors Exam 2012: Statistics

Math 1: Calculus with Algebra Midterm 2 Thursday, October 29. Circle your section number: 1 Freund 2 DeFord

Bernoulli and Binomial Distributions. Notes. Bernoulli Trials. Bernoulli/Binomial Random Variables Bernoulli and Binomial Distributions.

MATH 3510: PROBABILITY AND STATS July 1, 2011 FINAL EXAM

Math 493 Final Exam December 01

X = X X n, + X 2

Statistics 100 Exam 2 March 8, 2017

6.041/6.431 Spring 2009 Quiz 1 Wednesday, March 11, 7:30-9:30 PM. SOLUTIONS


STUDENT NAME: STUDENT SIGNATURE: STUDENT ID NUMBER: SECTION NUMBER RECITATION INSTRUCTOR:

Discrete Mathematics and Probability Theory Summer 2014 James Cook Final Exam

Math 51 Midterm 1 July 6, 2016

Chapter 5 : Probability. Exercise Sheet. SHilal. 1 P a g e

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

18.05 Exam 1. Table of normal probabilities: The last page of the exam contains a table of standard normal cdf values.

LHS Algebra Pre-Test

CS 361 Sample Midterm 2

HKUST. MATH1013 Calculus IB. Directions:

STUDENT NAME: STUDENT SIGNATURE: STUDENT ID NUMBER: SECTION NUMBER RECITATION INSTRUCTOR:

Business Statistics Midterm Exam Fall 2015 Russell. Please sign here to acknowledge

Student s Printed Name: _Key

3.2 Probability Rules

Algebra 2 CP Semester 1 PRACTICE Exam

1 Basic continuous random variable problems

Exam 2 Review Math 118 Sections 1 and 2

Homework 2. Spring 2019 (Due Thursday February 7)

Midterm Exam 1 (Solutions)

Final Exam - Solutions

MA 1125 Lecture 33 - The Sign Test. Monday, December 4, Objectives: Introduce an example of a non-parametric test.

STAT 311 Practice Exam 2 Key Spring 2016 INSTRUCTIONS

Linear Equations. Find the domain and the range of the following set. {(4,5), (7,8), (-1,3), (3,3), (2,-3)}

Math st Homework. First part of Chapter 2. Due Friday, September 17, 1999.

Chapter 2 Random Variables

ECE 313: Conflict Final Exam Tuesday, May 13, 2014, 7:00 p.m. 10:00 p.m. Room 241 Everitt Lab

Math 151. Rumbos Fall Solutions to Review Problems for Exam 2. Pr(X = 1) = ) = Pr(X = 2) = Pr(X = 3) = p X. (k) =

Name: Exam 2 Solutions. March 13, 2017

MAT 271E Probability and Statistics

CS280, Spring 2004: Final

Lynch, October 2016 Page 1 of 5. Math 150, Fall 2016 Exam 2 Form A Multiple Choice Sections 3A-5A

STA 4321/5325 Solution to Extra Homework 1 February 8, 2017

P (E) = P (A 1 )P (A 2 )... P (A n ).

What is a random variable

MATH UN Midterm 2 November 10, 2016 (75 minutes)

Probabilistic Systems Analysis Spring 2018 Lecture 6. Random Variables: Probability Mass Function and Expectation

Student s Printed Name: KEY_&_Grading Guidelines_CUID:

6.2b Homework: Fit a Linear Model to Bivariate Data

Introduction to Probability 2017/18 Supplementary Problems

Business Statistics 41000: Homework # 5

Class 26: review for final exam 18.05, Spring 2014

MATH 3200 PROBABILITY AND STATISTICS M3200FL081.1

Exam #1 March 9, 2016

Math 447. Introduction to Probability and Statistics I. Fall 1998.

Lecture 13 (Part 2): Deviation from mean: Markov s inequality, variance and its properties, Chebyshev s inequality

Lynch 2017 Page 1 of 5. Math 150, Fall 2017 Exam 2 Form A Multiple Choice

Dynamic Programming Lecture #4

ELEG 3143 Probability & Stochastic Process Ch. 2 Discrete Random Variables

Business Statistics 41000: Homework # 2 Solutions

University of California, Berkeley, Statistics 134: Concepts of Probability. Michael Lugo, Spring Exam 1

1 Probability Theory. 1.1 Introduction

Without fully opening the exam, check that you have pages 1 through 10.

AP Statistics Semester I Examination Section I Questions 1-30 Spend approximately 60 minutes on this part of the exam.

Question Points Score Total: 76

1 Basic continuous random variable problems

EXAM # 3 PLEASE SHOW ALL WORK!

PRACTICE PROBLEMS FOR EXAM 1

Test 2 - Answer Key Version A

Math 20 Spring Discrete Probability. Midterm Exam

Name: Firas Rassoul-Agha

Without fully opening the exam, check that you have pages 1 through 11.

18440: Probability and Random variables Quiz 1 Friday, October 17th, 2014

Summary of basic probability theory Math 218, Mathematical Statistics D Joyce, Spring 2016

EECS 126 Probability and Random Processes University of California, Berkeley: Fall 2014 Kannan Ramchandran September 23, 2014.

The University of the State of New York REGENTS HIGH SCHOOL EXAMINATION COURSE I. Wednesday, August 16, :30 to 11:30 a.m.

Midterm 2. Your Exam Room: Name of Person Sitting on Your Left: Name of Person Sitting on Your Right: Name of Person Sitting in Front of You:

Student s Printed Name:

ECN221 Exam 1 VERSION B Fall 2017 (Modules 1-4), ASU-COX VERSION B

Transcription:

First Midterm Examination Econ 103, Statistics for Economists February 14th, 2017 You will have 70 minutes to complete this exam. Graphing calculators, notes, and textbooks are not permitted. I pledge that, in taking and preparing for this exam, I have abided by the University of Pennsylvania s Code of Academic Integrity. I am aware that any violations of the code will result in a failing grade for this course. Name: Signature: Student ID #: Recitation #: Question: 1 2 3 4 5 6 Total Points: 20 20 20 20 20 40 140 Score: Instructions: Answer all questions in the space provided, continuing on the back of the page if you run out of space. Show your work for full credit but be aware that writing down irrelevant information will not gain you points. Be sure to sign the academic integrity statement above and to write your name and student ID number on each page in the space provided. Make sure that you have all pages of the exam before starting. Warning: If you continue writing after we call time, even if this is only to fill in your name, twenty-five points will be deducted from your final score. In addition, a point will be deducted for each page on which you do not write your name and student ID.

Econ 103 Midterm I, Page 2 of 8 February 14th, 2017 The following question appeared on the homework assignment to accompany Lecture #4: 20 1. What value of a minimizes n i=1 (y i a) 2? Prove your answer. Solution: Differentiating with respect to a, we find that the first-order condition is n 2 (y i a) = 0 Rearranging and splitting up the sum, this is equivalent to i=1 n y i = i=1 n a i=1 The sum on the right-hand side equals na. Dividing through by n gives a = ȳ. 20 2. I have three boxes. The first contains two red balls, the second contains two blue balls, and the third contains one red ball and one blue ball. I choose a box at random so that each box is equally likely to be selected. I then reach inside and pull out a ball: it s red. Calculate the probability that the other ball in the box is also red. To help us grade your answer more easily, let B 1 be the event that I chose the first box, B 2 the second box, and B 3 the third box. Let R be the event that I drew a red ball. Solution: The problem statement specifies P (B 1 ) = P (B 2 ) = P (B 3 ) = 1/3 along with P (R B 1 ) = 1, P (R B 2 ) = 0, and P (R B 3 ) = 1/2. We are asked to calculate the probability that the other ball is red given that the ball I drew is red. This is equivalent to asking for the probability of B 1 given R. By Bayes Rule For the denominator, P (B 1 R) = P (R B 1)P (B 1 ) P (R) P (R) = P (R B 1 )P (B 1 ) + P (R B 2 )P (B 2 ) + P (R B 3 )P (B 3 ) = 1 1/3 + 0 1/3 + 1/2 1/3 = 1/2 Therefore, P (B 1 R) = (1/3)/(1/2) = 2/3. 3. Suppose we have a dataset (x 1, y 1 ),..., (x n, y n ) of scores for n students on two exams: x denotes exam #1 and y denotes exam #2. Let s xy denote the sample covariance between

Econ 103 Midterm I, Page 3 of 8 February 14th, 2017 exam scores, r denote the sample correlation, s x and s y the respective sample standard deviations, and x and ȳ the respective sample means. Let ŷ = a + bx where a is the intercept and b the slope of a linear regression calculated from this dataset. 2 (a) Write down the formula for a in terms of b, x, and ȳ. Solution: a = ȳ b x 2 (b) Write down the formula for b in terms of s xy and s x. Solution: b = s xy /s 2 x 2 (c) Write down the formula for r in terms of s xy, s x and s y Solution: r = s xy /(s x s y ) 6 (d) Suppose that b > r. Then which is larger: s x or s y? Explain. Solution: Taking the ratio of the formulas from (b) and (c) we have b/r = s y /s x so if b > r then s y > s x. 8 (e) Suppose the sample mean score on the first exam, x, was 75% while the sample mean score on the second exam, ȳ, was 85%. Moreover, our regression model predicts that someone who got 50 on the first exam will get 70 on the second. Calculate b. Solution: When x = 50 we have ŷ = 70. Thus 70 = a + 50b. Using part (a), and combining these: Solving, b = 15/( 25) = 3/5 = 0.6. a = ȳ b x = 85 75b 70 = (85 75b) + 50b = 85 25b 20 4. Veronica wants to know the fraction of undergraduates who drink underage. Because she worries that her subjects may be unwilling to admit to breaking the law, Veronica conducts her interviews as follows: Hi, I m carrying out a survey on underage drinking. In a moment I m going to ask if you drink underage, but before I do I d like you to flip this fair coin in secret. If you get heads, answer my question truthfully. If you get tails,

Econ 103 Midterm I, Page 4 of 8 February 14th, 2017 answer YES regardless of whether you drink underage. Because I don t know the outcome of your coin flip, if you answer YES I ll never know whether you really drink underage of just flipped a tails. Veronica interviews a large number of subjects who all follow her instructions exactly: 85% answer YES. Calculate the fraction of undergraduates who drink underage. Solution: Let Y be the event that someone answers YES, D be the event that he drinks underage, and T be the event that he flips a tails. Since Y = D T, by the addition rule we have P (Y ) = P (D) + P (T ) P (D T ) Since the coin is fair, P (T ) = 0.5. Since the outcome of a coin toss doesn t depend on whether someone does or doesn t drink underage, D and T are independent which implies P (D T ) = P (D)P (T ). Substituting this information along with P (Y ) = 0.85, 0.85 = P (D) + 0.5 0.5 P (D) and solving we see that P (D) = 0.7. Alternatively, you could solve this using the Law of Total Probability: P (Y ) = P (Y T )P (T ) + P (Y T c )P (T c ) 0.85 = 1 1/2 + P (Y T c ) 1/2 0.7 = P (Y T c ) using the fact that P (Y T c ) = P (D) since anyone who flips heads answers truthfully, and the outcome of the coin flip has no bearing on whether someone drinks underage.

Econ 103 Midterm I, Page 5 of 8 February 14th, 2017 5. Let X be a random variable with support set { 1, 1}, p(1) = q, and p( 1) = 1 q. 5 (a) Write down the CDF of X. Solution: 0, x 0 < 1 F (x 0 ) = 1 q, 1 x 0 < 1 1, x 0 1 3 (b) Calculate E[X]. Solution: E[X] = 1 (1 q) + 1 q = 2q 1 3 (c) Calculate E[X 2 ]. Solution: E[X 2 ] = (1 q) ( 1) 2 + q 1 2 = (1 q) + q = 1 3 (d) Calculate V ar(x). Solution: By the shortcut rule: V ar(x) = E[X 2 ] E[X] 2 = 1 (2q 1) 2 = 1 (4q 2 4q + 1) = 4q(1 q) Another way to solve this is to notice that we can write X as a linear transformation of a Bernoulli RV. In particular if Z Bernoulli(q) then X = 2Z 1 and hence V ar(x) = 4V ar(z) = 4q(1 q) as we calculated directly. 6 (e) Let X 1 and X 2 be independent RVs both of which have the same pmf as X, defined in the problem statement. Write out the support set and pmf of Y = X 1 + X 2. Solution: The support set is { 2, 0, 2} and the pmf is p( 2) = P (X 1 = 1 X 2 = 1) = (1 q) 2 p(0) = P (X 1 = 1 X 2 = 1) + P (X 1 = 1 X 2 = 1) = 2q(1 q) p(2) = P (X 1 = 1 X 2 = 1) = q 2 6. This question concerns a data.table called star. Here are its first and last five rows:

Econ 103 Midterm I, Page 6 of 8 February 14th, 2017 race classtype yearssmall hsgrad g4math g4reading 1: 1 3 0 NA NA NA 2: 2 3 0 NA 706 661 3: 1 3 0 1 711 750 4: 2 1 4 NA 672 659 5: 1 2 0 NA NA NA --- 6321: 2 2 0 NA NA NA 6322: 1 2 0 NA NA NA 6323: 2 2 0 0 NA NA 6324: 1 1 1 NA NA NA 6325: 1 1 1 NA NA NA The data come from a randomized controlled experiment carried out in Tennessee in the 1980s. Students were randomly assigned to one of three class types, as indicated by the column classtype: 1 = small class, 2 = regular-sized class, 3 = regular-sized class with a teacher s aid. In the years that followed, researchers measured several outcomes: hsgrad is an indicator that takes the value 1 if a student graduated high school and 0 if not, g4math and g4reading are the student s scores on standardized 4 th grade reading and math tests. In this question we will not work with the columns race or yearssmall. 5 (a) Say that the data are stored at http://data.com/star.csv. Give the full set of commands needed to read star.csv into R as a data.table called star. Solution: library(data.table) star <- fread("http://data.com/star.csv") 5 (b) Give the R command needed to create a new data.table called small that only contains students who were randomly assigned to a small class. Solution: There are many possible answers, such as small <- star[classtype == 1] or small <- subset(star, classtype == 1) 5 (c) Write R code to carry out a linear regression that predicts a student s reading score from her math score. Use the data for all students in the experiment.

Econ 103 Midterm I, Page 7 of 8 February 14th, 2017 Solution: lm(g4reading ~ g4math, data = star) or star[, lm(g4reading ~ g4math)] 5 (d) Here are the results of running the command from the previous part: Coefficients: (Intercept) g4math 317.5765 0.5696 If Beatrice scored 10 points higher in math than Alejandro, how many points higher would we predict that Beatrice will score in reading? Solution: 10 times the coefficient on g4math, or about 5.7 points higher. 5 (e) Write R code to calculate the average high school graduation rate separately for students assigned to each different kind of class. Be sure to properly account for missing values. Solution: There are many possible answers. output in the following section was: The one used to generate the star[,.(grad_rate = mean(hsgrad, na.rm = TRUE)), keyby = classtype] 5 (f) The results of the preceding part were as follows: classtype grad_rate 1: 1 0.8359202 2: 2 0.8251619 3: 3 0.8392857 Based on these results, do smaller class sizes improve high school graduation rates relative to regular-sized classes? Explain in no more than three sentences. Solution: Students randomly assigned to the small classes (classtype 1) have a slightly higher graduation rate than those assigned to the regular classes (classtype 2) namely 83.6% versus 82.5%. On the other hand, students assigned to regular classes with a teacher s aid (classtype 3) have an even higher graduation rate: 83.9%. So it may not be the class size per se that matters. On the whole the differences are small.

Econ 103 Midterm I, Page 8 of 8 February 14th, 2017 10 (g) Write an R function called mycor that takes two input arguments, a numeric vector x and another numeric vector y, and returns the sample correlation between then. In your answer you may use any R commands you like except the built-in correlation command cor. For simplicity you may assume that neither x nor y contains any missing values. Solution: There are many possible answers. Here s one possibility: mycor <- function(x, y){ s_xy <- cov(x, y) s_x <- sd(x) s_y <- sd(y) return(s_xy / (s_x * s_y)) }