Psych 10 / Stats 60, Practice Problem Set 5 (Week 5 Material) Part 1: Power (and building blocks of power)

Part 1: Power (and building blocks of power)

1. A researcher plans to do a two-tailed hypothesis test with a sample of n = 100 people and a significance level of α = .05. For each change, describe how it would affect (a) the probability of Type I Error, (b) the probability of Type II Error, and (c) the power of the test.

a. changing to a significance level of α = .01
   decrease probability of Type I Error, increase probability of Type II Error, decrease power of the test
b. changing to a one-tailed test with a significance level of α = .05
   no effect on probability of Type I Error, and assuming the researcher had been correct about the direction of the effect, it would decrease the probability of Type II Error and increase power
c. increasing the sample size to n = 500
   no effect on probability of Type I Error, decrease probability of Type II Error, increase power
d. (outside of the experimenter's control), the difference between the true population mean and the null hypothesis increased
   no effect on probability of Type I Error, decrease probability of Type II Error, increase power
e. (outside of the experimenter's control), the standard deviation of the population increased
   no effect on probability of Type I Error, increase probability of Type II Error, decrease power

note: more generally, the only thing that affects the probability of Type I Error is alpha, and anything that increases the probability of Type II Error will decrease power (since the probability of Type II Error and power must sum to 1)

2. For each scenario, find the sample mean that corresponds to the z-statistic. (You also might want to think about whether this procedure would change if you were instead finding a sample mean corresponding to a t-statistic).
   The procedure would not change for a sample mean corresponding to a t-statistic.
   In every part, x̄ = μ0 + z * σ_x̄ = μ0 + z * (σ / √n).

a. μ0 = 100, σ = 20, n = 17, z = 1.24
   x̄ = 100 + 1.24 * (20 / √17) = 106.01
b. μ0 = 100, σ = 20, n = 17, z = -.83
   x̄ = 100 + (-.83) * (20 / √17) = 95.97
c. μ0 = 100, σ = 10, n = 17, z = 1.24
   x̄ = 100 + 1.24 * (10 / √17) = 103.01
d. μ0 = 100, σ = 20, n = 100, z = 1.24
   x̄ = 100 + 1.24 * (20 / √100) = 102.48
e. μ0 = 120, σ = 20, n = 17, z = 1.24
   x̄ = 120 + 1.24 * (20 / √17) = 126.01

Take a moment to compare your answers for a-e, and notice how each change from the scenario in a changes the answer and why.

f. μ0 = -20, σ = 3, n = 6, z = -2.30
   x̄ = -20 + (-2.30) * (3 / √6) = -22.82
g. μ0 = -86, σ = 1000, n = 25, z = .64
   x̄ = -86 + .64 * (1000 / √25) = 42
h. μ0 = 337, σ = 7, n = 1000, z = 2.00
   x̄ = 337 + 2.00 * (7 / √1000) = 337.44
i. μ0 = 0, σ = 10, n = 200, z = -1.00
   x̄ = 0 + (-1.00) * (10 / √200) = -.71

3. For each set of values, calculate the probability that we would observe sample data that would allow us to reject the null hypothesis that µ = μ0, if in fact the true population mean is µ = μ_guess.

note: making a sketch for these problems is highly recommended, and is the best way to keep track of whether you are looking for probability in the upper vs. lower tail
also note: there was no value for α or instructions for a one vs. two tailed test in the version of the problem set that was originally posted. When in doubt, for this class, use α = .05 and a two-tailed test.

a. μ0 = 100, σ = 20, n = 25, μ_guess = 110
   i. first, figure out x̄_critical (which is in reference to the distribution under μ0).
      we'll reject H0 if z < -1.96 or z > 1.96
      since μ_guess is > μ0, we're interested in the sample mean, x̄_critical, that corresponds to z_critical = 1.96
      x̄_critical = μ0 + z_critical * σ_x̄ = μ0 + z_critical * σ / √n = 100 + 1.96 * 20 / √25 = 107.84
   ii. second, figure out the z-value that corresponds to x̄_critical in the distribution under μ_guess, and calculate the probability of falling beyond that z-value.
      in this case we are interested in finding a sample mean that gives us a z-statistic greater than this z-value, since μ_guess is > μ0
      z = (x̄_critical - μ_guess) / σ_x̄ = (x̄_critical - μ_guess) / (σ / √n) = (107.84 - 110) / (20 / √25) = -.54
      p(z > -.54) = pnorm(-.54, lower.tail=FALSE) = .71
      this tells us that if the true population mean is actually 110 (and the true population standard deviation is actually 20), and we sample n = 25 people, we will have a 71% chance of correctly rejecting H0 that μ = 100

b. μ0 = 50, σ = 1, n = 25, μ_guess = 49
   i. first, figure out x̄_critical (which is in reference to the distribution under μ0).
      we'll reject H0 if z < -1.96 or z > 1.96
      since μ_guess is < μ0, we're interested in the sample mean, x̄_critical, that corresponds to z_critical = -1.96
      x̄_critical = μ0 + z_critical * σ_x̄ = μ0 + z_critical * σ / √n = 50 + (-1.96) * 1 / √25 = 49.61
   ii. second, figure out the z-value that corresponds to x̄_critical in the distribution under μ_guess, and calculate the probability of falling beyond that z-value.
      in this case we are interested in finding a sample mean that gives us a z-statistic less than this z-value, since μ_guess is < μ0
      z = (x̄_critical - μ_guess) / σ_x̄ = (x̄_critical - μ_guess) / (σ / √n) = (49.61 - 49) / (1 / √25) = 3.05
      p(z < 3.05) = pnorm(3.05, lower.tail=TRUE) = .999
      this tells us that if the true population mean is actually 49 (and the true population standard deviation is actually 1), and we sample n = 25 people, we will have a 99.9% chance of correctly rejecting H0 that μ = 50

c. μ0 = -25, σ = 10, n = 36, μ_guess = -27
   i. first, figure out x̄_critical (which is in reference to the distribution under μ0).
      we'll reject H0 if z < -1.96 or z > 1.96
      since μ_guess is < μ0, we're interested in the sample mean, x̄_critical, that corresponds to z_critical = -1.96
      x̄_critical = μ0 + z_critical * σ_x̄ = μ0 + z_critical * σ / √n = -25 + (-1.96) * 10 / √36 = -28.27
   ii. second, figure out the z-value that corresponds to x̄_critical in the distribution under μ_guess, and calculate the probability of falling beyond that z-value.
      in this case we are interested in finding a sample mean that gives us a z-statistic less than this z-value, since μ_guess is < μ0
      z = (x̄_critical - μ_guess) / σ_x̄ = (x̄_critical - μ_guess) / (σ / √n) = (-28.27 - (-27)) / (10 / √36) = -.76
      p(z < -.76) = pnorm(-.76, lower.tail=TRUE) = .22
      this tells us that if the true population mean is actually -27 (and the true population standard deviation is actually 10), and we sample n = 36 people, we will have a 22% chance of correctly rejecting H0 that μ = -25

d. μ0 = 0, σ = 1000, n = 16, μ_guess = 200
   i. first, figure out x̄_critical (which is in reference to the distribution under μ0).
      we'll reject H0 if z < -1.96 or z > 1.96
      since μ_guess is > μ0, we're interested in the sample mean, x̄_critical, that corresponds to z_critical = 1.96
      x̄_critical = μ0 + z_critical * σ_x̄ = μ0 + z_critical * σ / √n = 0 + 1.96 * 1000 / √16 = 490
   ii. second, figure out the z-value that corresponds to x̄_critical in the distribution under μ_guess, and calculate the probability of falling beyond that z-value.
      in this case we are interested in finding a sample mean that gives us a z-statistic greater than this z-value, since μ_guess is > μ0
      z = (x̄_critical - μ_guess) / σ_x̄ = (x̄_critical - μ_guess) / (σ / √n) = (490 - 200) / (1000 / √16) = 1.16
      p(z > 1.16) = pnorm(1.16, lower.tail=FALSE) = .12
      this tells us that if the true population mean is actually 200 (and the true population standard deviation is actually 1000), and we sample n = 16 people, we will have a 12% chance of correctly rejecting H0 that μ = 0

You might notice that the power in the last two scenarios is very poor! Even if the effect exists, we most likely wouldn't be able to detect it with the current study design, so we might wonder, why even do the study at all? In these scenarios the sample size is the only thing that is under the control of the researcher / statistician, so they might decide to increase their sample size or to not invest their time and energy in pursuing this particular question, since the odds are low that they will make the correct decision in their test (if their assumptions about the true population mean and standard deviation are correct).

4. A team captain reads that if you start with a penny that is facing tails up, it has a 51% chance of landing facing tails up. He is wondering if this is true, because if it is, he should use this strategy when calling the coin toss at the beginning of games. He flips a coin 10 times (starting with tails up) and finds that he does not have enough evidence to reject the null hypothesis that π = .50. Why is the power of the test important here, and how might understanding power influence his conclusions?

We could do a formal power calculation here, but we can notice right away that the power of this test is likely very low. Even if the true process proportion is π = .51, this is going to yield a distribution of sample proportions that looks a lot like the distribution of sample proportions we would see with a process proportion of π = .50, so we would need a very large sample size (much greater than 10) to have a good chance of detecting this effect. By understanding this, we can realize that failing to reject H0 doesn't tell us very much here, because we probably weren't going to be able to reject H0 even if π = .51. (And this is one of the primary reasons why, if we fail to reject H0, we do not conclude that H0 is true).

(This is beyond the scope of the material for this class, but if you wanted to do a formal power calculation you could use qbinom() or some trial and error with dbinom() to figure out that we would need to observe either 9 or 10 successes to reject H0 using α = .05, two-tailed (or one-tailed, it turns out). You could then use pbinom() or dbinom() to ask about the probability of observing 9 or more successes if π = .51, and you would find that the power of the test is .013. In other words, he only has a 1.3% chance of detecting the effect if it exists (i.e., a 1.3% chance of rejecting H0 and finding evidence that π > .50), so failing to reject the null hypothesis is not informative here. You might also then notice that he has a .91% chance of getting 0 or 1 tails, which would lead him to reject H0 and conclude that the true population proportion, π, is less than .50 (he would conclude that if the coin starts on tails it is most likely to land on heads). This is not a good situation to be in, since his power to draw a correct conclusion is extremely close to his probability of drawing a very incorrect conclusion.)
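The power calculations in Part 1 can be checked in R with a short sketch like the one below. The helper name z_power and the variable names are made up for illustration and are not part of the course materials. Like the worked answers to problem 3, the helper counts only the tail on the side of μ_guess, and it uses qnorm() for the critical value rather than the rounded 1.96, so the unrounded results can differ slightly from the hand calculations.

```r
# Problem 3: two-step power calculation for a two-tailed one-sample z-test.
# z_power is a hypothetical helper name, not a built-in R function.
z_power <- function(mu0, sigma, n, mu_guess, alpha = 0.05) {
  se <- sigma / sqrt(n)            # standard error of the sample mean
  z_crit <- qnorm(1 - alpha / 2)   # about 1.96 when alpha = .05
  if (mu_guess > mu0) {
    x_crit <- mu0 + z_crit * se    # step i: critical sample mean under mu0
    # step ii: probability of exceeding x_crit under mu_guess
    pnorm((x_crit - mu_guess) / se, lower.tail = FALSE)
  } else {
    x_crit <- mu0 - z_crit * se
    pnorm((x_crit - mu_guess) / se, lower.tail = TRUE)
  }
}

power_3 <- c(a = z_power(100, 20, 25, 110),
             b = z_power(50, 1, 25, 49),
             c = z_power(-25, 10, 36, -27),
             d = z_power(0, 1000, 16, 200))
round(power_3, 2)   # roughly .71, 1.00, .22, .12

# Problem 4: exact binomial power for 10 flips, H0: pi = .50, true pi = .51.
n_flips <- 10
alpha <- 0.05
k <- 0:n_flips
# Largest lower cutoff whose doubled tail probability under H0 stays within alpha
c_low <- max(k[2 * pbinom(k, n_flips, 0.5) <= alpha])
c_high <- n_flips - c_low          # symmetric upper cutoff

# Probability of rejecting in the correct direction (9 or 10 tails) if pi = .51
power_upper <- pbinom(c_high - 1, n_flips, 0.51, lower.tail = FALSE)
# Probability of rejecting in the wrong direction (0 or 1 tails) if pi = .51
wrong_direction <- pbinom(c_low, n_flips, 0.51)
round(c(power_upper, wrong_direction), 4)
```

Comparing power_upper with wrong_direction makes the point of problem 4 concrete: with only 10 flips, the chance of a correct rejection is barely larger than the chance of rejecting in the wrong direction.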

Part 2: Required sample sizes for desired margin of error

1. (Adapted from Tintle 3.CE.5). The margin of error for a 95% confidence interval for a process (or population) probability, π, can be approximated using 1 / √n.

a. Calculate what the margin of error for a 95% confidence interval will be for the following sample sizes: 100, 400, 1000, 2000, 8000, 9000. (Note, vector operations in R would be very useful here).
   > n <- c(100, 400, 1000, 2000, 8000, 9000)
   > margin <- 1 / sqrt(n)
   > margin
   [1] 0.10000000 0.05000000 0.03162278 0.02236068 0.01118034 0.01054093

b. Sketch your results by hand or plot your results in R (make sure to define the x axis and the y axis if you use the plot() function), and describe how the margin of error changes with the sample size.
   > plot(n, margin)

c. In order to cut the margin of error in half, by how many times must you increase your sample size?
   we want (bigger margin of error) = 2 * (smaller margin of error)
   1/√n_bigger_margin = 2 * 1/√n_smaller_margin
   (1/√n_bigger_margin)² = (2 * 1/√n_smaller_margin)²
   1/n_bigger_margin = 4 * 1/n_smaller_margin
   n_smaller_margin = 4 * n_bigger_margin
   we need to multiply our sample size by 4 in order to cut the margin of error in half

d. Which would have a bigger impact on the margin of error: increasing the sample size from 100 to 400 or increasing the sample size from 8000 to 9000? (And explain why).
   Increasing the sample size from 100 to 400 is a larger relative increase (4x) than increasing the sample size from 8000 to 9000 (1.125x), and will have a bigger impact on the margin of error (.10 to .05, versus roughly .011 to .011). Because the margin of error depends on 1 / √n, we get less bang for our buck when we increase n once n is already large (diminishing returns).

2. For each scenario, compute the required sample size to get a confidence interval with at most the specified margin of error. (Remember that we are only considering confidence intervals for two-tailed tests).

- Note that this question is referring to confidence intervals for single means. This wasn't explicitly stated in the original version of the problem set, but the standard deviations and the sizes of the margins of error are too large to be referring to proportions, which are bounded between 0 and 1.
- Also note that σ indicates we should be using z-statistics (known population standard deviation), and s indicates that we should be using t-statistics (estimated population standard deviation). However, because we cannot know df for a t-statistic without knowing the sample size, we use a z-statistic for a sample size calculation for a t-test, justified by the fact that the required sample size is typically large enough that there is minimal difference between the t distribution with df based on the required sample size and the distribution of z.
- Also note that we always round up for required sample size calculations, because we want to know how many observations we need to sample (once, not on average in the long run, so we need a whole number) to get at most the specified margin of error, and increasing n decreases the margin of error.

a. σ = 10, α = .05, margin of error of at most 6
   margin = z_critical,α=.05 * σ_x̄ = z_critical,α=.05 * σ / √n
   √n = z_critical,α=.05 * σ / margin
   n = (z_critical,α=.05 * σ / margin)²
   n = (1.96 * 10 / 6)²
   n = 10.67, round up to n = 11
b. σ = 20, α = .05, margin of error of at most 6
   margin = z_critical,α=.05 * σ_x̄ = z_critical,α=.05 * σ / √n
   √n = z_critical,α=.05 * σ / margin
   n = (z_critical,α=.05 * σ / margin)²
   n = (1.96 * 20 / 6)²
   n = 42.68, round up to n = 43
c. σ = 10, α = .01, margin of error of at most 6
   margin = z_critical,α=.01 * σ_x̄ = z_critical,α=.01 * σ / √n
   √n = z_critical,α=.01 * σ / margin
   n = (z_critical,α=.01 * σ / margin)²
   n = (2.58 * 10 / 6)²
   n = 18.49, round up to n = 19
d. s = 20, α = .01, margin of error of at most 2
   margin = z_critical,α=.01 * s_x̄ = z_critical,α=.01 * s / √n
   √n = z_critical,α=.01 * s / margin
   n = (z_critical,α=.01 * s / margin)²
   n = (2.58 * 20 / 2)²
   n = 665.64, round up to n = 666
e. s = 15, α = .10, margin of error of at most 3
   margin = z_critical,α=.10 * s_x̄ = z_critical,α=.10 * s / √n
   √n = z_critical,α=.10 * s / margin
   n = (z_critical,α=.10 * s / margin)²
   n = (1.64 * 15 / 3)²
   n = 67.24, round up to n = 68

Part 3: Sampling, causality, and generalization

Screenshots of relevant problems available on Canvas in the Files section.

Tintle et al., Chapter 3:

a. the coffee bars might have their longest waiting times at different times of the day (due to different customer patterns, staffing patterns, etc). b. you might want to visit each coffee bar at multiple times throughout the day, or stay at each coffee bar for the entire day and sample customers at random throughout the day

this question has a double negative that makes giving a yes/no answer confusing

yes, the options are more clear, and rather than giving a yes/no answer, the respondent needs to clearly state their opinion

we would expect "assistance to the poor" to give a more favorable response rate toward these programs

responses will vary, there are many correct answers here

people might be uncomfortable answering "no" to an interviewer who is smoking (on the other hand, they might be particularly bothered by the smoke and become more inclined to say "yes")

answers will vary and there are reasonable explanations for a number of answers, but standard answers would be: a. underestimate or overestimate (people are generally not great at estimating how much sleep they get), b. overestimate or overstate, c. overstate, d. overestimate or overstate, e. overstate

Tintle et al., Chapter 4:

the weather is colder and snowier during these months, and these weather patterns might also be related to heart attacks (you might also generate other explanations)

no, number of TVs is likely a proxy for wealth or development in a country, which is linked to life expectancy

because we don't think there is a cause-and-effect relationship between number of TVs and life expectancy, changing the number of TVs in a country would not change life expectancy

a. explanatory (grouping) variable is Mediterranean diet vs. no Mediterranean diet, and the response variable is memory and cognitive skills. b. there are many possible explanations here. One is that people who choose a Mediterranean diet might tend to be wealthy (a Mediterranean diet tends to be expensive) or might share other socioeconomic similarities that are also linked to memory and cognitive skills

a. teenagers, b. the explanatory (grouping) variable is whether or not the teenager eats family dinners, and the response variable is the teenager's drug use, c. there are many possible explanations here. One is that family dinners require parents to be home during dinner time, and the feasibility of this is linked to socioeconomic status (as is drug use); another is to consider parents' drug use or other genetic factors that might influence both teenage drug use and the presence or absence of family dinners

a. the explanatory (grouping) variable is children vs. no children, which is categorical, b. the response variable is life span (lifetime), which is quantitative, c. the
dotplots suggest that the central tendency for lifetime is higher for men with children, d. however, the dotplots also highlight that many men do not have children until they reach a certain age, and so by definition we might expect men who have children to be older than men who did not have children, and thus if we take a sample of men who have died, we would also expect men who have children to be older than men who do not have children

a (c is very important, but it relates to how you sample the larger pool of participants, before randomly assigning them to group)

c

a

b (it is the only scenario in which we wouldn't expect non-random pre-existing differences between the two groups)

a (it is the only scenario in which we wouldn't expect non-random pre-existing differences between the two groups)

Part 4: Comparing two proportions

HIV and AZT (adapted from Tintle et al.). In a 1994 study, 164 pregnant, HIV-positive women were randomly assigned to receive the drug AZT during pregnancy and 160 pregnant, HIV-positive women were randomly assigned to a control group that received a placebo. They found that 40 of the mothers in the control group gave birth to babies who were HIV-positive, compared to only 13 in the AZT group.

a. Is this an observational study or an experiment?
   experiment, because women were randomly assigned to group

b. Identify the grouping variable and the response variable
   grouping variable is AZT vs. placebo, response variable is whether there was an HIV-positive birth

c. Construct a 2 x 2 table of frequencies (counts), with the grouping variable in columns. Fill in the marginal frequencies too.

                        AZT              PLACEBO          TOTAL
   HIV-POSITIVE BIRTH   13               40               13 + 40 = 53
   HIV-NEGATIVE BIRTH   164 - 13 = 151   160 - 40 = 120   151 + 120 = 271
   TOTAL                164              160              164 + 160 = 324

d. Calculate p(born HIV-positive | placebo), p(placebo | born HIV-positive), p(born HIV-positive | AZT), and p(placebo | not born HIV-positive). Which two of these would we compare if we wanted to examine the effectiveness of AZT in preventing HIV-positive births?
   p(born HIV-positive | placebo) = 40 / 160 = .25
   p(placebo | born HIV-positive) = 40 / 53 = .75
   [note, it is coincidence that the first two conditional probabilities sum to 1, this is not guaranteed]
   p(born HIV-positive | AZT) = 13 / 164 = .079
   p(placebo | not born HIV-positive) = 120 / 271 = .44
   if we wanted to compare the effectiveness of AZT and placebo in preventing HIV-positive births, we would want to compare the proportion of HIV-positive births in the AZT vs. placebo groups, i.e., p(born HIV-positive | placebo) and p(born HIV-positive | AZT)

e. Compare the central tendency and variability of the placebo and AZT groups.
   The mode of each group is HIV-negative birth. The AZT group is less variable than the placebo group (relative frequency at the mode is higher, .92 vs. .75)

f. Calculate the difference in sample proportions of HIV-positive births, p̂_placebo - p̂_AZT (note that, counterintuitively, this means we are labelling an HIV-positive birth a "success")
   p̂_placebo = p(born HIV-positive | placebo) = 40 / 160 = .25
   p̂_AZT = p(born HIV-positive | AZT) = 13 / 164 = .079
   p̂_placebo - p̂_AZT = .25 - .079 = .171

g. State two hypotheses about the difference in population proportions, π_placebo - π_AZT, with the null hypothesis stating there is no relationship between placebo/AZT and HIV-positive births. Use two-tailed hypotheses.
   H0: π_placebo - π_AZT = 0
   HA: π_placebo - π_AZT ≠ 0

h. Use the Two Proportion applet to simulate a distribution of differences in sample proportions that we could observe if there was no relationship between placebo/AZT and HIV-positive births. Use this distribution to compute the probability of observing a sample difference as extreme as or more extreme than our sample difference, if there was no relationship between the two variables. (Or describe how you could simulate a single instance of a difference in sample proportions).
   Exact answer will vary from simulation to simulation with the applet. Please see us in office hours for help using the applet that we used in lecture. The applet simulates the following procedure: you could shuffle 53 red cards and 271 blue cards, and randomly deal them into piles of 164 and 160, and calculate the difference in proportion of red cards in the two piles.

i. Use a theory-based method to
   i. Describe the mean and standard deviation of the distribution of differences in sample proportions that we could observe if there was no relationship between placebo/AZT and HIV-positive births
      μ_(p̂_placebo - p̂_AZT) = 0
      σ_(p̂_placebo - p̂_AZT) = √( (π_pooled * (1 - π_pooled)) / n1 + (π_pooled * (1 - π_pooled)) / n2 )
      calculate, π_pooled = 53 / 324 = .16
      σ_(p̂_placebo - p̂_AZT) = √( (.16 * (1 - .16)) / 164 + (.16 * (1 - .16)) / 160 )
      σ_(p̂_placebo - p̂_AZT) = .041
      (not asked, but we also assume the distribution is a normal distribution, which holds as long as the distributions of individual group proportions are normally distributed)
   ii. Describe the location of our observed difference in sample proportions (p̂_placebo - p̂_AZT) in this distribution, using a z-statistic
      z = ((p̂_placebo - p̂_AZT) - μ_(p̂_placebo - p̂_AZT)) / σ_(p̂_placebo - p̂_AZT)
      z = (.171 - 0) / .041 = 4.17
   iii. Use R to calculate the two-tailed p-value for that z-statistic, and test the null hypothesis using a significance level of α = .05.
      p(z < -4.17 or z > 4.17) = pnorm(-4.17) * 2 ≈ .00003

j. Reflect on the choice of a two-tailed test with α = .05. Why might a two-tailed test have been a good idea here? Thinking about the consequences of potential Type I and Type II Errors in this scenario, do you have an argument for adjusting α?
   It would be very important to know if AZT led to more or fewer HIV-positive births than a placebo, and so we are interested in making an inference regardless of the direction of our observed sample difference. A Type I Error would be erroneously detecting a relationship between AZT and HIV-positive births when one does not exist. A Type II Error would be concluding that we do not have enough evidence that there is a relationship between AZT and HIV-positive births when one does actually exist. If we were especially concerned about Type I Error we would want to decrease alpha, and if we were especially concerned about Type II Error we would want to increase alpha.

k. Calculate the measures of effect size of relative risk and number needed to treat, and interpret what they mean in this situation.
   relative risk = p̂_larger / p̂_smaller = .25 / .079 = 3.16
   Based on our best estimate from our sample proportions, it seems that a woman taking a placebo is 3.16 times as likely to have an HIV-positive birth relative to a woman taking AZT.
   NNT = 1 / (p̂_placebo - p̂_AZT) = 1 / .171 = 5.85
   Based on our best estimate from our sample proportions, we would, on average, need to give 5.85 women AZT instead of a placebo to prevent a single HIV-positive birth.

l. Can we draw cause-and-effect conclusions from this study, i.e., does taking the drug AZT during pregnancy cause a reduction in HIV-positive births?
   Yes, because women were randomly assigned to take AZT versus a placebo, so we don't need to worry that the two groups differ with respect to other, confounding variables (beyond what we would expect by random chance).

Praising children (adapted from Tintle et al.). Psychologists investigated whether praising a child's intelligence, rather than praising his / her effort, tends to have negative consequences such as undermining their motivation (Mueller and Dweck, 1998). Children participating in the study were given a set of problems to solve. After the first set of problems children were randomly assigned to be praised for either their intelligence or their effort. The children then were given another set of problems to solve and later told how many they got right. They were then asked to share the number that they got right with other students. Some of the children misrepresented (i.e., lied about) how many they got right. Of 59 children, 11 were praised for intelligence and misrepresented their score, 4
were praised for effort and misrepresented their score, 18 were praised for intelligence and did not misrepresent their score, and 26 were praised for effort and did not misrepresent their score. Researchers were interested in learning whether there was a difference in the proportion of children who lied, depending on how they were praised.

a. Is this an observational study or an experiment?
   experiment, because children were randomly assigned to receive intelligence or effort praise

b. Identify the grouping variable and the response variable
   the grouping variable is type of praise and the response variable is whether or not children misrepresented their score

c. Construct a 2 x 2 table of frequencies (counts), with the grouping variable in columns. Fill in the marginal frequencies too.

                          INTELLIGENCE   EFFORT         TOTAL
   DID MISREPRESENT       11             4              11 + 4 = 15
   DID NOT MISREPRESENT   18             26             18 + 26 = 44
   TOTAL                  11 + 18 = 29   4 + 26 = 30    59

d. Calculate p(misrepresented score | intelligence praise), p(misrepresented score | effort praise), p(praised for intelligence | misrepresented score), and p(praised for intelligence | did not misrepresent score). Which two of these would we compare if we wanted to examine the effect of type of praise on misrepresenting scores?
   p(misrepresented score | intelligence praise) = 11 / 29 = .38
   p(misrepresented score | effort praise) = 4 / 30 = .13
   p(praised for intelligence | misrepresented score) = 11 / 15 = .73
   p(praised for intelligence | did not misrepresent score) = 18 / 44 = .41
   if we wanted to compare how intelligence vs. effort praise influences misrepresentation of scores, we would want to compare the proportion of children who misrepresent their scores in the intelligence vs. effort groups, i.e., p(misrepresented score | intelligence praise) and p(misrepresented score | effort praise)

e. Compare the central tendency and variability of the effort and intelligence praise groups.
   In both groups the mode is "did not misrepresent score", but the intelligence praise group has greater variability than the effort praise group (relative frequency at the mode = .62 vs. .87)

f. Calculate the difference in sample proportions of misrepresentation of scores, p̂_intelligence - p̂_effort (note that, counterintuitively, this means we are labelling a misrepresentation a "success")
   p̂_intelligence = p(misrepresented score | intelligence praise) = 11 / 29 = .38
   p̂_effort = p(misrepresented score | effort praise) = 4 / 30 = .13
   p̂_intelligence - p̂_effort = .38 - .13 = .25

g. State two hypotheses about the difference in population proportions, π_intelligence - π_effort, with the null hypothesis stating there is no relationship
between type of praise and misrepresentation of scores. Use two-tailed hypotheses.
   H0: π_intelligence - π_effort = 0
   HA: π_intelligence - π_effort ≠ 0

h. Use the Two Proportion applet to simulate a distribution of differences in sample proportions that we could observe if there was no relationship between type of praise and misrepresentation of scores. Use this distribution to compute the probability of observing a sample difference as extreme as or more extreme than our sample difference, if there was no relationship between the two variables. (Or describe how you could simulate a single instance of a difference in sample proportions).
   Exact answer will vary from simulation to simulation with the applet. Please see us in office hours for help using the applet that we used in lecture. The applet simulates the following procedure: you could shuffle 15 red cards and 44 blue cards, and randomly deal them into piles of 29 and 30, and calculate the difference in proportion of red cards in the two piles.

i. Use a theory-based method to
   i. Describe the mean and standard deviation of the distribution of differences in sample proportions that we could observe if there was no relationship between type of praise and misrepresentation of scores
      μ_(p̂_intelligence - p̂_effort) = 0
      σ_(p̂_intelligence - p̂_effort) = √( (π_pooled * (1 - π_pooled)) / n1 + (π_pooled * (1 - π_pooled)) / n2 )
      calculate, π_pooled = 15 / 59 = .25
      σ_(p̂_intelligence - p̂_effort) = √( (.25 * (1 - .25)) / 29 + (.25 * (1 - .25)) / 30 )
      σ_(p̂_intelligence - p̂_effort) = .11
      (not asked, but we also assume the distribution is a normal distribution, which holds as long as the distributions of individual group proportions are normally distributed)
   ii. Describe the location of our observed difference in sample proportions (p̂_intelligence - p̂_effort) in this distribution, using a z-statistic
      z = ((p̂_intelligence - p̂_effort) - μ_(p̂_intelligence - p̂_effort)) / σ_(p̂_intelligence - p̂_effort)
      z = (.25 - 0) / .11 = 2.27
   iii. Use R to calculate the two-tailed p-value for that z-statistic, and test the null hypothesis using a significance level of α = .05.
      p(z < -2.27 or z > 2.27) = pnorm(-2.27) * 2 = .023

j. Calculate the measures of effect size of relative risk and number needed to treat, and interpret what they mean in this situation.
   relative risk = p̂_larger / p̂_smaller = .38 / .13 = 2.92
   Based on our best estimate from our sample proportions, it seems that a child praised for intelligence is 2.92 times as likely to misrepresent their score relative to a child praised for effort.
   NNT = 1 / (p̂_intelligence - p̂_effort) = 1 / .25 = 4.00
   Based on our best estimate from our sample proportions, we would, on average, need to praise 4.00 children for effort instead of for intelligence to prevent a child
misrepresenting their score. (Another way to phrase this is that we would, on average, need to praise 4.00 children for intelligence instead of for effort to find one more child who misrepresents their score).

k. Can we draw cause-and-effect conclusions from this study, i.e., does being praised for intelligence cause a change in misrepresentation of scores?
   Yes, because children were randomly assigned to get effort versus intelligence praise, so we don't need to worry that the two groups differ with respect to other, confounding variables (beyond what we would expect by random chance).

Japanese comics (courtesy of Mia Lewis). A researcher is interested in comparing attributes of boys' and girls' comics in Japan. She samples the first page of n = 45 boys' comics and n = 54 girls' comics, and counts the number of first pages in each group that include sparkles. She finds that 12 of the boys' comics include sparkles and 43 of the girls' comics include sparkles.

a. Describe the population value that the researcher is interested in.
   We want to know the difference in proportion of sparkles in the entire population of boys' comics vs. the entire population of girls' comics.

b. Identify the grouping variable and the response variable
   The grouping variable is boys' vs. girls' comics, and the response variable is whether or not there are sparkles

c. Construct a 2 x 2 table of frequencies (counts), with the grouping variable in columns. Fill in the marginal frequencies too.

                 BOYS           GIRLS          TOTAL
   SPARKLES      12             43             12 + 43 = 55
   NO SPARKLES   45 - 12 = 33   54 - 43 = 11   33 + 11 = 44
   TOTAL         45             54             99

d. Compare the central tendency and variability of the boys' and girls' comics.
   The mode for boys' comics is no sparkles and the mode for girls' comics is sparkles. The boys' comics have greater variability than the girls' comics (relative frequency at the mode is 33 / 45 = .73 vs. 43 / 54 = .80).

e. Calculate the difference in sample proportions of "has sparkles", p̂_boys - p̂_girls.
   p̂_boys = 12 / 45 = .27
   p̂_girls = 43 / 54 = .80
   p̂_boys - p̂_girls = .27 - .80 = -.53

f. State two hypotheses about the difference in population proportions, π_boys - π_girls. Use two-tailed hypotheses.
   H0: π_boys - π_girls = 0
   HA: π_boys - π_girls ≠ 0

g. Use the Two Proportion applet to simulate a distribution of differences in sample proportions that we could observe if there was no relationship between type of comic and presence of sparkles. Use this distribution to compute the probability of observing a sample difference as extreme as or more extreme than our sample difference, if
14 there was no relationship between the two variables. (Or describe how you could simulate a single instance of a difference in sample proportions). Exact answer will vary from simulation to simulation with the applet. Please see us in office hours for help using the applet that we used in lecture. The applet simulates the following procedure: you could shuffle 55 red cards and 44 blue cards, and randomly deal them into piles of 45 and 54, and calculate the difference in proportion of red cards in the two piles. h. Use a theory-based method to a. Describe the mean and standard deviation of the distribution of sample proportions that we could observe if there was no relationship between type of comic and presence of sparkles μ p _boys p _girls = 0 σ p _boys p _girls = (π pooled * (1-π pooled )) / n 1 + (π pooled * (1-π pooled )) / n 2 )) calculate, π pooled = 55 / 99 =.56 σ p _boys p _girls = (.56 * (1-.56)) / 45 + (.56 * (1-.56)) / 54)) σ p _boys p _girls =.10 (not asked, but we also assume the distribution is a normal distribution, which holds as long as the distributions of individual group proportions are normally distributed) b. Describe the location of our observed difference in sample proportions (p boys p ĝirls ) in this distribution, using a z-statistic z = ((p boys p ĝirls ) - μ p _boys p _girls ) / σ p _boys p _girls z = ( ) /.10 = c. Use R to calculate the two-tailed p-value for that z-statistic, and test the null hypothesis using a significance level of α =.05. p(z < or z > 5.30) = pnorm(-5.30) * 2 = i. Calculate the measures of effect size of relative risk and number needed to treat, and interpret what they mean in this situation. relative risk = p larger / p ŝmaller =.80 /.27 = Based on our best estimate from our sample proportions, it seems that a girls comic book is 2.96 times as likely to have sparkles than a boys comic book. 
NNT = 1 / p îboys p ĝirls = 1 / = Based on our best estimate from our sample proportions, we would, on average, need to sample 1.89 girls comic books instead of boys comic books find one more comic books with sparkles. (Another way of saying this is that we would, on average, need to sample 1.89 boys comic books instead of girls comic books to find one fewer comic books with sparkles).
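
The Japanese comics calculations can also be checked in code. The following is a minimal sketch in Python (standard library only), not part of the original answer key, which uses R and the Two Proportion applet. The function names `two_proportion_z`, `effect_sizes`, and `simulate_differences` are made up for illustration, and `NormalDist().cdf` stands in for R's `pnorm`. The simulation follows the card-shuffling description from part g.

```python
import math
import random
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test: returns (difference, standard error, z, two-tailed p)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) / n1 + pooled * (1 - pooled) / n2)
    z = (p1 - p2) / se  # the null hypothesis says the mean difference is 0
    p_value = 2 * NormalDist().cdf(-abs(z))  # plays the role of pnorm(-abs(z)) * 2 in R
    return p1 - p2, se, z, p_value


def effect_sizes(x1, n1, x2, n2):
    """Relative risk (larger proportion over smaller) and number needed to treat."""
    p1, p2 = x1 / n1, x2 / n2
    relative_risk = max(p1, p2) / min(p1, p2)
    nnt = 1 / abs(p1 - p2)
    return relative_risk, nnt


def simulate_differences(x1, n1, x2, n2, reps=10_000, seed=0):
    """Shuffle 'sparkle' and 'no sparkle' cards and deal them into piles of n1 and n2."""
    rng = random.Random(seed)  # the seed is an arbitrary choice, for repeatability
    cards = [1] * (x1 + x2) + [0] * (n1 + n2 - x1 - x2)
    diffs = []
    for _ in range(reps):
        rng.shuffle(cards)
        diffs.append(sum(cards[:n1]) / n1 - sum(cards[n1:]) / n2)
    return diffs


# Counts from the problem: 12 of 45 boys' comics and 43 of 54 girls' comics have sparkles.
diff, se, z, p_value = two_proportion_z(12, 45, 43, 54)
relative_risk, nnt = effect_sizes(12, 45, 43, 54)
diffs = simulate_differences(12, 45, 43, 54)
share_extreme = sum(abs(d) >= abs(diff) for d in diffs) / len(diffs)

print(f"difference = {diff:.2f}, SE = {se:.2f}, z = {z:.2f}, p = {p_value:.2g}")
print(f"relative risk = {relative_risk:.2f}, NNT = {nnt:.2f}")
print(f"simulated share at least as extreme = {share_extreme:.4f}")
```

Because this sketch carries unrounded proportions through every step, the z-statistic comes out at roughly -5.28 rather than the -5.30 obtained above from rounded values, and the relative risk is likewise slightly different from 2.96. The conclusion does not change.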


Chapter 10: Comparing Two Populations or Groups Chapter 10: Comparing Two Populations or Groups Sectio0.1 The Practice of Statistics, 4 th edition For AP* STARNES, YATES, MOORE Chapter 10 Comparing Two Populations or Groups 10.1 10.2 Comparing Two Means

More information

Rolling marble lab. B. Pre-Lab Questions a) When an object is moving down a ramp, is its speed increasing, decreasing, or staying the same?

Rolling marble lab. B. Pre-Lab Questions a) When an object is moving down a ramp, is its speed increasing, decreasing, or staying the same? IP 614 Rolling marble lab Name: Block: Date: A. Purpose In this lab you are going to see, first hand, what acceleration means. You will learn to describe such motion and its velocity. How does the position

More information

Samples and Populations Confidence Intervals Hypotheses One-sided vs. two-sided Statistical Significance Error Types. Statistiek I.

Samples and Populations Confidence Intervals Hypotheses One-sided vs. two-sided Statistical Significance Error Types. Statistiek I. Statistiek I Sampling John Nerbonne CLCG, Rijksuniversiteit Groningen http://www.let.rug.nl/nerbonne/teach/statistiek-i/ John Nerbonne 1/41 Overview 1 Samples and Populations 2 Confidence Intervals 3 Hypotheses

More information

Introduction to Statistics

Introduction to Statistics MTH4106 Introduction to Statistics Notes 6 Spring 2013 Testing Hypotheses about a Proportion Example Pete s Pizza Palace offers a choice of three toppings. Pete has noticed that rather few customers ask

More information

Inferences for Correlation

Inferences for Correlation Inferences for Correlation Quantitative Methods II Plan for Today Recall: correlation coefficient Bivariate normal distributions Hypotheses testing for population correlation Confidence intervals for population

More information

Chapter 10: Comparing Two Populations or Groups

Chapter 10: Comparing Two Populations or Groups Chapter 10: Comparing Two Populations or Groups Sectio0.1 The Practice of Statistics, 4 th edition For AP* STARNES, YATES, MOORE Chapter 10 Comparing Two Populations or Groups 10.1 10. Comparing Two Means

More information

PSY 305. Module 3. Page Title. Introduction to Hypothesis Testing Z-tests. Five steps in hypothesis testing

PSY 305. Module 3. Page Title. Introduction to Hypothesis Testing Z-tests. Five steps in hypothesis testing Page Title PSY 305 Module 3 Introduction to Hypothesis Testing Z-tests Five steps in hypothesis testing State the research and null hypothesis Determine characteristics of comparison distribution Five

More information

Two sided, two sample t-tests. a) IQ = 100 b) Average height for men = c) Average number of white blood cells per cubic millimeter is 7,000.

Two sided, two sample t-tests. a) IQ = 100 b) Average height for men = c) Average number of white blood cells per cubic millimeter is 7,000. Two sided, two sample t-tests. I. Brief review: 1) We are interested in how a sample compares to some pre-conceived notion. For example: a) IQ = 100 b) Average height for men = 5 10. c) Average number

More information

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides

Chapter 7. Inference for Distributions. Introduction to the Practice of STATISTICS SEVENTH. Moore / McCabe / Craig. Lecture Presentation Slides Chapter 7 Inference for Distributions Introduction to the Practice of STATISTICS SEVENTH EDITION Moore / McCabe / Craig Lecture Presentation Slides Chapter 7 Inference for Distributions 7.1 Inference for

More information

Difference Between Pair Differences v. 2 Samples

Difference Between Pair Differences v. 2 Samples 1 Sectio1.1 Comparing Two Proportions Learning Objectives After this section, you should be able to DETERMINE whether the conditions for performing inference are met. CONSTRUCT and INTERPRET a confidence

More information

Hypothesis Testing and Confidence Intervals (Part 2): Cohen s d, Logic of Testing, and Confidence Intervals

Hypothesis Testing and Confidence Intervals (Part 2): Cohen s d, Logic of Testing, and Confidence Intervals Hypothesis Testing and Confidence Intervals (Part 2): Cohen s d, Logic of Testing, and Confidence Intervals Lecture 9 Justin Kern April 9, 2018 Measuring Effect Size: Cohen s d Simply finding whether a

More information

Probability and Probability Distributions. Dr. Mohammed Alahmed

Probability and Probability Distributions. Dr. Mohammed Alahmed Probability and Probability Distributions 1 Probability and Probability Distributions Usually we want to do more with data than just describing them! We might want to test certain specific inferences about

More information