COGS 14B: INTRODUCTION TO STATISTICAL ANALYSIS

TA: Sai Chowdary Gullapally (scgullap@eng.ucsd.edu)
Office Hours: Thursday (Mandeville) 3:30PM - 4:30PM (or by appointment)
Slides: I am using the amazing slides made by Joey :) for most of the questions. Feel free to email anytime regarding doubts.

Hypothesis Testing: What are we actually doing? What is the intuition?

A not so different scenario: Let us assume we have a variable x (could be anything, like length, height, etc.). Further assume that we know x has one of the probability densities given below, but unfortunately we do not know which one is the correct one. So what is the best thing we can do to guess? TAKE A SAMPLE AND SEE! And then?

Type I Error: H0 is true, but we reject it (probability set by α).
Type II Error: H1 is true, but we retain H0 (probability given by β!).
[Figure: hypothesized and true sampling distributions, with the α and β (Type II error) regions marked]

Type I Error: H0 is true, but we reject it (probability given by α).
Type II Error: H1 is true, but we retain H0 (probability given by β; more later!).
Power: Probability of detecting a particular effect (i.e., rejecting H0 when it is false).
[Figure: hypothesized sampling distribution (H0) and true sampling distribution (alternative hypothesis H1), with the α, β (Type II error), and POWER regions marked]
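To make the link between α, β, and power concrete, here is a minimal sketch for an upper-tailed z test. The numbers (μ0 = 100, true mean 105, σ = 15, n = 36) are hypothetical values chosen only for illustration, and the function name is made up for this sketch. Power is computed as 1 − β.

```python
from math import sqrt
from statistics import NormalDist

def power_one_tailed_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Return (power, beta) for an upper-tailed z test when the true mean is mu_true."""
    se = sigma / sqrt(n)                      # standard error of the mean
    z_crit = NormalDist().inv_cdf(1 - alpha)  # critical z under H0
    x_crit = mu0 + z_crit * se                # smallest sample mean that rejects H0
    # beta: chance the sample mean falls below the cutoff when H1 is true
    beta = NormalDist(mu_true, se).cdf(x_crit)
    return 1 - beta, beta

# Hypothetical values, for illustration only
power, beta = power_one_tailed_z(mu0=100, mu_true=105, sigma=15, n=36)
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Re-running the function with a larger n, a larger α, or a true mean farther from μ0 shows how each factor changes the power.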

Name some factors that affect the power of a statistical test. Which of these factors does the experimenter have control over?

Hypothesis Testing Summary (so far)

Z test
Use when: You want to compare a sample mean to the mean of a population with a known standard deviation.
Statistical hypotheses (two-tailed): H0: μ = μ0; H1: μ ≠ μ0
Test statistic: z = (X̄ − μ0) / σ_X̄
Distribution: z distribution (standard normal)

t test (one sample)
Use when: You want to compare one sample mean to the (hypothesized) mean of a population with an unknown standard deviation.
Statistical hypotheses (two-tailed): H0: μ = μ0; H1: μ ≠ μ0
Test statistic: t = (X̄ − μ0) / s_X̄
Distribution: t distribution with n − 1 degrees of freedom

t test (2 independent samples)
Use when: You want to compare means of two independent samples taken from different populations (with unknown standard deviations).
Statistical hypotheses (two-tailed): H0: μ1 − μ2 = 0; H1: μ1 − μ2 ≠ 0
Test statistic: t = ((X̄1 − X̄2) − (μ1 − μ2)hyp) / s_(X̄1 − X̄2)
Distribution: t distribution with n1 + n2 − 2 degrees of freedom
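The three test statistics in this summary can be sketched as small functions. The function names are made up for this sketch, and the z test inputs in the example call are hypothetical. The standard error of the mean is taken as the standard deviation divided by √n.

```python
from math import sqrt

def z_statistic(sample_mean, mu0, sigma, n):
    """z test: population standard deviation sigma is known."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

def one_sample_t(sample_mean, mu0, s, n):
    """One-sample t test: s is the sample standard deviation. Returns (t, df)."""
    return (sample_mean - mu0) / (s / sqrt(n)), n - 1

def two_sample_t(mean1, mean2, se_diff, n1, n2, hyp_diff=0.0):
    """Two independent samples: se_diff is the estimated standard error
    of the difference between means. Returns (t, df)."""
    return ((mean1 - mean2) - hyp_diff) / se_diff, n1 + n2 - 2

# Hypothetical inputs, for illustration only
print(z_statistic(sample_mean=105, mu0=100, sigma=15, n=36))
```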

A random sample of college seniors received special training on how to take the GRE. After analyzing their scores, an investigator reported a dramatic gain relative to the national average of 500, as indicated by a 95% confidence interval of 507 to 527. Are the following interpretations true or false?
a) About 95% of all subjects scored between 507 and 527. False
b) The interval from 507 to 527 refers to a set of possible values of the population mean for all students who undergo special training. True
c) The true population mean is definitely between 507 and 527. False
d) This particular interval contains the population mean about 95% of the time. False
e) In practice, we never really know whether the interval from 507 to 527 is true or false. True
f) We can be reasonably confident that the population mean is between 507 and 527. True

Hypothesis Testing Summary (so far)

t test (repeated measures)
Use when: You want to compare means between two groups in a repeated measures design (or paired design).
Statistical hypotheses (two-tailed): H0: μD = 0; H1: μD ≠ 0
Test statistic: t = (D̄ − μD hyp) / s_D̄
Distribution: t distribution with n − 1 degrees of freedom (n is # of pairs)

The following are the test performances of the same group of 6 people, obtained with and without caffeine. Does caffeine increase test performance?

Solution (caffeine and test performance):
H0: μD = 0
H1: μD > 0
Degrees of freedom: 6 − 1 = 5
t critical = 2.015
t = 4.110
Decision: Reject H0
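The paired calculation can be sketched as follows. The scores below are hypothetical (they are not the scores from this problem, so the resulting t differs from 4.110); only the critical value 2.015 for df = 5 is taken from the solution above. The function name is made up for this sketch.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(with_treatment, without_treatment):
    """Repeated-measures t: work on the difference scores D. Returns (t, df)."""
    diffs = [a - b for a, b in zip(with_treatment, without_treatment)]
    n = len(diffs)                       # number of pairs
    se = stdev(diffs) / sqrt(n)          # estimated standard error of mean difference
    return mean(diffs) / se, n - 1       # hypothesized mean difference is 0

# Hypothetical scores for 6 people, for illustration only
caffeine = [12, 15, 11, 14, 13, 16]
no_caffeine = [10, 12, 11, 11, 12, 13]

t, df = paired_t(caffeine, no_caffeine)
T_CRIT = 2.015                           # one-tailed, 0.05 level, df = 5
reject_h0 = t >= T_CRIT
print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```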

Hypothesis Testing Summary (so far)

ANOVA (one-way, between subjects)
Use when: You want to test whether there are any differences between multiple (>2) population means. Why not multiple t tests?
Statistical hypotheses: H0: μ0 = μ1 = μ2 = ... = μn; H1: H0 is false
Test statistic: F = MS between / MS within
Distribution: F distribution (df between and df within)

Can you use ANOVA to test a directional hypothesis?

What is SS total, what is SS within, what is SS between, and how are they related?
(a) Group 1 Group 2 Group 3: 3 2 1 5 3 4 5 6 7
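The relationship SS total = SS between + SS within can be checked numerically. In this sketch the assignment of the nine scores to three groups is an assumption made for illustration. The identity holds for any grouping, and the function name is made up for this sketch.

```python
def sums_of_squares(groups):
    """Return (ss_total, ss_between, ss_within) for a list of groups of scores."""
    scores = [x for g in groups for x in g]
    correction = sum(scores) ** 2 / len(scores)            # G^2 / N
    ss_total = sum(x ** 2 for x in scores) - correction
    # between: variability of group totals around the grand total
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    # within: variability of scores around their own group mean
    ss_within = sum(sum(x ** 2 for x in g) - sum(g) ** 2 / len(g) for g in groups)
    return ss_total, ss_between, ss_within

# Assumed grouping of the scores, for illustration only
groups = [[3, 2, 1], [5, 3, 4], [5, 6, 7]]
ss_total, ss_between, ss_within = sums_of_squares(groups)
print(ss_total, ss_between, ss_within)
```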

Textbook 17.6
A psychologist tests whether shy college students initiate more eye contacts with strangers because of training sessions in assertive behavior. Assume the 8 subjects, coded A, B, ..., G, H, are tested repeatedly after zero, one, two, and three training sessions. The results are expressed as the number of eye contacts:

Subj.    Zero  One  Two  Three  T subj
A        1     2    4    7      14
B        0     1    2    6      9
C        0     2    3    6      11
D        2     4    6    7      19
E        3     4    7    9      23
F        4     6    8    10     28
G        2     3    5    8      18
H        1     3    5    7      16
T group  13    25   40   60     G = 138

Given: SS between = 154.12, SS within = 132.75, SS total = 286.87

Source     SS      df  MS     F
Between    154.12  3   51.37  16.62
Within     132.75  28  -      -
  Subject  67.87   7   -      -
  Error    64.88   21  3.09
Total      286.87  31  -

Decision: Reject H0
Interpretation: Trainings have an effect on the number of eye contacts initiated.
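The bookkeeping behind this summary table can be sketched as follows. The sketch takes the SS values as given in the problem (it does not recompute them from raw scores) and splits SS within into subject and error parts. The function and key names are made up for this sketch.

```python
def repeated_measures_table(ss_between, ss_within, ss_subject, k, n):
    """k = number of conditions, n = number of subjects.
    Returns df, SS error, MS, and F for a repeated-measures ANOVA."""
    df_between = k - 1
    df_within = k * (n - 1)              # N - k, with N = k * n scores
    df_subject = n - 1
    df_error = df_within - df_subject    # what is left after removing subjects
    ss_error = ss_within - ss_subject
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    return {
        "df_between": df_between, "df_within": df_within,
        "df_subject": df_subject, "df_error": df_error,
        "ss_error": ss_error, "ms_between": ms_between,
        "ms_error": ms_error, "F": ms_between / ms_error,
    }

# SS values as given in the problem: 4 conditions, 8 subjects
table = repeated_measures_table(154.12, 132.75, 67.87, k=4, n=8)
for name, value in table.items():
    print(f"{name}: {value:.2f}")
```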

Effect size (from the ANOVA summary table above):
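A common effect size measure for a repeated-measures ANOVA is partial eta squared, SS between / (SS between + SS error). The sketch below assumes that this is the measure intended here, and uses the SS values from the summary table.

```python
def partial_eta_squared(ss_between, ss_error):
    """Proportion of variance explained by the treatment, after
    variability due to subjects has been set aside."""
    return ss_between / (ss_between + ss_error)

# SS values from the repeated-measures summary table
eta_p2 = partial_eta_squared(ss_between=154.12, ss_error=64.88)
print(f"partial eta squared = {eta_p2:.2f}")
```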

Textbook 17.6 (continued): Multiple Comparisons

         Zero   One    Two  Three
X̄ group  1.625  3.125  5    7.5

Groups are labeled by number of training sessions. So Group 3 is different from Groups 0, 1, 2. Group 2 is different from Group 0.
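These comparisons can be reproduced with Tukey's HSD = q √(MS error / n). In this sketch, q = 3.96 is an assumed table value (0.05 level, 4 groups, df = 20, the nearest tabled df below df error = 21). MS error = 3.09 and n = 8 come from the problem.

```python
from itertools import combinations
from math import sqrt

Q_CRIT = 3.96      # assumed studentized range value: 0.05 level, k = 4, df = 20
MS_ERROR = 3.09    # from the ANOVA summary table
N_PER_GROUP = 8    # subjects per condition

# Group means, keyed by number of training sessions
means = {0: 1.625, 1: 3.125, 2: 5.0, 3: 7.5}

hsd = Q_CRIT * sqrt(MS_ERROR / N_PER_GROUP)

# A pair differs significantly when its mean difference exceeds HSD
significant = [
    (a, b) for a, b in combinations(means, 2)
    if abs(means[a] - means[b]) > hsd
]
print(f"HSD = {hsd:.2f}")
print("Significant pairs:", significant)
```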

ANOVA process:

Textbook 17.6 (continued): Effect size

         Zero   One    Two  Three
X̄ group  1.625  3.125  5    7.5

Example: Size of effect between 0 trainings and 3 trainings. Could also do effect size between 0 and 2 trainings, 1 and 3 trainings, etc.
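One way to compute such a pairwise effect size is Cohen's d with √(MS error) as the standard deviation estimate. That choice of standardizer is an assumption of this sketch. The means and MS error = 3.09 come from the problem.

```python
from math import sqrt

MS_ERROR = 3.09  # from the repeated-measures summary table

def cohens_d(mean_a, mean_b, ms_error):
    """Standardized mean difference, using sqrt(MS error) as the
    standard deviation estimate (an assumption of this sketch)."""
    return (mean_a - mean_b) / sqrt(ms_error)

# Three trainings vs. zero trainings
d_3_vs_0 = cohens_d(7.5, 1.625, MS_ERROR)
print(f"d (3 vs. 0 trainings) = {d_3_vs_0:.2f}")
```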

Things to look out for:
- Alternate hypothesis in ANOVA
- Confidence interval is always taken as per the two-tailed test table
- Calculating HSD (remember to take the proper degrees of freedom for q)

A few more solved questions:

An investigator wishes to determine whether alcohol consumption causes a deterioration in the performance of automobile drivers. Before the driving test, subjects drink a glass of orange juice, which, in the case of the treatment group, is laced with two ounces of vodka. Performance is measured by the number of errors made on a driving simulator. A total of 120 subjects are randomly assigned, in equal numbers, to the two groups. For subjects in the treatment group, the mean number of errors equals 26.4, and for subjects in the control group, the mean number of errors equals 18.6. The estimated standard error equals 2.4.

Test: t test for two independent samples! (directional or non-directional?)
H0: μT − μC ≤ 0
H1: μT − μC > 0
Decision rule: Reject H0 at the 0.05 level of significance if t ≥ 1.671 with 118 degrees of freedom.
Calculation: t = ((26.4 − 18.6) − 0) / 2.4 = 3.25
Decision: Reject H0.
Interpretation: Alcohol consumption causes an increase in mean performance errors in a driving simulator.
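The calculation and decision for this problem can be sketched in a few lines. All numbers come from the problem statement and decision rule above. The variable names are made up for this sketch.

```python
mean_treatment = 26.4   # mean errors, vodka group
mean_control = 18.6     # mean errors, control group
se_diff = 2.4           # estimated standard error of the difference
n_total = 120           # subjects, split equally between the two groups

df = n_total - 2                                  # n1 + n2 - 2
t = ((mean_treatment - mean_control) - 0) / se_diff

T_CRIT = 1.671          # one-tailed critical t from the decision rule
reject_h0 = t >= T_CRIT
print(f"t = {t:.2f}, df = {df}, reject H0: {reject_h0}")
```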

For the same study (treatment mean 26.4, control mean 18.6, estimated standard error 2.4):

Specify the p-value for this test result.
p < 0.001

Calculate a 95% confidence interval for the true population mean difference and interpret this interval.
(X̄T − X̄C) ± t conf (s_(X̄1 − X̄2)) = 7.8 ± (2.00)(2.4) = [3, 12.6]

Use Cohen's d to estimate the effect size, given that the pooled standard deviation sp equals 13.15. Is this a large, medium, or small effect?
d = (X̄1 − X̄2) / √(s²p) = (26.4 − 18.6) / 13.15 = 0.59
Medium effect
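The confidence interval and Cohen's d can be checked with a short sketch. The value t conf = 2.00 is an assumption here: it is the two-tailed 0.05 value consistent with the reported interval [3, 12.6]. All other numbers come from the problem.

```python
mean_treatment, mean_control = 26.4, 18.6
se_diff = 2.4          # estimated standard error of the difference
pooled_sd = 13.15      # pooled standard deviation s_p
T_CONF = 2.00          # assumed two-tailed 0.05 table value

diff = mean_treatment - mean_control

# 95% confidence interval for the population mean difference
ci_lower = diff - T_CONF * se_diff
ci_upper = diff + T_CONF * se_diff

# Cohen's d: mean difference in pooled standard deviation units
d = diff / pooled_sd

print(f"95% CI: [{ci_lower:.1f}, {ci_upper:.1f}]")
print(f"Cohen's d = {d:.2f}")
```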

Ex. 1 - A psychologist tests whether a series of workshops on assertive training increases eye contacts initiated by shy college students in controlled interactions with strangers. A total of 32 subjects are randomly assigned, 8 to a group, to attend either 0, 1, 2, or 3 workshop sessions. Use the given information to complete the ANOVA summary table below.

SOURCE   SS      df  MS     F
Between  154.12  3   51.37  10.84
Within   132.75  28  4.74   X
Total    286.87  31  X      X
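Completing a one-way (between subjects) summary table from the two SS values is mostly bookkeeping, as this sketch shows. The function and key names are made up for this sketch.

```python
def one_way_anova_table(ss_between, ss_within, k, n_total):
    """k = number of groups, n_total = total number of subjects."""
    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "df_between": df_between,
        "df_within": df_within,
        "df_total": n_total - 1,
        "ss_total": ss_between + ss_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

# Ex. 1: 4 groups of 8 subjects
ex1 = one_way_anova_table(154.12, 132.75, k=4, n_total=32)
for name, value in ex1.items():
    print(f"{name}: {value:.2f}")
```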

Ex. 2 - Students were given different drug treatments before studying for their midterm. Some were given a memory drug, some a placebo drug, and some no treatment. The midterm scores (%) are shown below for the three different groups. At the 0.05 level of significance, do any of the treatments have an effect?

Group means: X̄ = 83.4, X̄ = 50, X̄ = 16.6

Not all the means are the same: treatment has some effect.

Ex. 2 (continued)

Group Totals (T): 417, 250, 83
Grand total (G) = 750

ANOVA Summary Table
SOURCE   SS        df  MS      F
Between  11,155.6  2   5577.8  50.16
Within   1334.4    12  111.2   X
Total    12,490    14  X       X
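SS between in this table follows from the group totals alone. The sketch below assumes 5 students per group, which is consistent with df total = 14 and with the group means (for example, 417 / 5 = 83.4). SS within is taken from the table.

```python
group_totals = [417, 250, 83]
n_per_group = 5                      # assumed: 15 students in 3 equal groups
ss_within = 1334.4                   # taken from the summary table

k = len(group_totals)
n_total = k * n_per_group
grand_total = sum(group_totals)      # G = 750

# SS between = sum(T^2 / n) - G^2 / N
ss_between = (
    sum(t ** 2 / n_per_group for t in group_totals)
    - grand_total ** 2 / n_total
)

ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)
f_ratio = ms_between / ms_within
print(f"SS between = {ss_between:.1f}, F = {f_ratio:.2f}")
```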

Ex. 2 (continued)

Group means: X̄ = 83.4, X̄ = 50, X̄ = 16.6
- How strong is the effect?
- Which means are different?
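Both follow-up questions can be approached with the quantities already in the summary table. This sketch uses eta squared (SS between / SS total) for strength of effect and Tukey's HSD for the pairwise comparisons. It assumes 5 students per group and an assumed table value q = 3.77 (0.05 level, 3 groups, df within = 12). Groups are numbered 1 to 3 in the order their means are listed.

```python
from itertools import combinations
from math import sqrt

ss_between, ss_total = 11155.6, 12490.0
ms_within = 111.2
n_per_group = 5        # assumed group size
Q_CRIT = 3.77          # assumed studentized range value: 0.05 level, k = 3, df = 12

# Strength of effect: proportion of total variability explained by treatment
eta_squared = ss_between / ss_total

# Group means, numbered in the order listed
means = {1: 83.4, 2: 50.0, 3: 16.6}
hsd = Q_CRIT * sqrt(ms_within / n_per_group)
different = [
    (a, b) for a, b in combinations(means, 2)
    if abs(means[a] - means[b]) > hsd
]

print(f"eta squared = {eta_squared:.2f}, HSD = {hsd:.2f}")
print("Pairs of means that differ:", different)
```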