T-test recap. Week 7. 5/13/12. One-sample t-test, paired t-test, independent samples t-test.

Outline
  T-test review: one-sample t-test, paired t-test, independent samples t-test
  Additional tests of significance: correlations, qualitative data

In each case, we're looking to see whether the mean that we observe in our sample is greater than the mean of the sampling distribution under the null hypothesis.

One-sample t-test

Children: recognition of words, given two picture alternatives.
Question: Is this better than chance performance?

Participant   Score
Subject 01    1
Subject 02    0.5
Subject 03    0.25
Subject 04    0.75
Subject 05    0.75
Subject 06    0.5
Subject 07    0.75
Subject 08    1
Subject 09    1
Subject 10    0.5
Subject 11    1
Subject 12    1
Subject 13    1
Subject 14    1
Subject 15    0.5
Subject 16    0.75
Subject 17    1
Subject 18    0.75
Subject 19    0.5
Mean: .763   StDev: .243

H0: Chance performance is .5 (50%)
H1: Actual performance is not .5
N = 19 participants, df = 18

$\bar{x} = .763$, $\mu = .500$, $s_{\bar{x}} = .243/\sqrt{19} = .0557$

$$t = \frac{\bar{x} - \mu}{s_{\bar{x}}} = \frac{.763 - .500}{.0557} = 4.72$$

Look up critical t: 2.101. We reject the null hypothesis.

In R:
> e1 = c(1.00, 0.50, 0.25, 0.75, 0.75, 0.50, 0.75, 1.00, 1.00, 0.50, 1.00, 1.00, 1.00, 1.00, 0.50, 0.75, 1.00, 0.75, 0.50)
> t.test(e1, mu=.5)

Report: Participants scored an average of 76% correct (SD = 24%), which exceeded chance performance (t(18) = 4.73, p < .001, d = 1.08).
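The hand calculation for the one-sample test can be checked step by step in R. This is a minimal sketch, not part of the original slides: the names `mu0`, `se`, `t_hand`, `t_crit`, and `d` are illustrative, and the third score is taken as 0.25, the value that reproduces the reported mean of .763.

```r
# Word-recognition scores for the 19 children (third value assumed to be 0.25)
e1 <- c(1.00, 0.50, 0.25, 0.75, 0.75, 0.50, 0.75, 1.00, 1.00, 0.50,
        1.00, 1.00, 1.00, 1.00, 0.50, 0.75, 1.00, 0.75, 0.50)
mu0 <- 0.5                               # chance performance under H0

n      <- length(e1)
se     <- sd(e1) / sqrt(n)               # standard error of the mean
t_hand <- (mean(e1) - mu0) / se          # t = (sample mean - mu) / standard error
t_crit <- qt(0.975, df = n - 1)          # two-tailed critical t at alpha = .05
d      <- (mean(e1) - mu0) / sd(e1)      # Cohen's d for a one-sample test

fit <- t.test(e1, mu = mu0)              # built-in test for comparison

print(round(c(mean = mean(e1), sd = sd(e1), se = se,
              t_by_hand = t_hand, t_from_R = unname(fit$statistic),
              critical_t = t_crit, cohens_d = d), 3))
```

The hand-computed t and the statistic from `t.test` should agree to rounding, and both exceed the critical value.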

Paired t-test

Kids were given a second, harder test.
Question: Did kids get worse on the second test?
Note that observations are paired.

Participant   Score1   Score2   Difference
Subject 01    1        0.5      0.5
Subject 02    0.5      0.5      0
Subject 03    0.25     0.5      -0.25
Subject 04    0.75     0.5      0.25
Subject 05    0.75     0        0.75
Subject 06    0.5      0.5      0
Subject 07    0.75     0.5      0.25
Subject 08    1        0.5      0.5
Subject 09    1        1        0
Subject 10    0.5      0.25     0.25
Subject 11    1        0.75     0.25
Subject 12    1        0.5      0.5
Subject 13    1        0.25     0.75
Subject 14    1        0.5      0.5
Subject 15    0.5      0.5      0
Subject 16    0.75     0.5      0.25
Subject 17    1        1        0
Subject 18    0.75     0        0.75
Subject 19    0.5      0.5      0
Mean D: .276   StDev: .299   $s_{\bar{D}} = .299/\sqrt{19} = .069$

H0: Difference is 0
H1: Difference is not 0
Critical t? Still 2.101.

$$t = \frac{\bar{D} - \mu_D}{s_{\bar{D}}} = \frac{.276 - 0}{.069} = 4.0$$

We reject the null hypothesis.

In R:
> e1 = c(1.00, 0.50, 0.25, 0.75, 0.75, 0.50, 0.75, 1.00, 1.00, 0.50, 1.00, 1.00, 1.00, 1.00, 0.50, 0.75, 1.00, 0.75, 0.50)
> e1b = c(0.50, 0.50, 0.50, 0.50, 0.00, 0.50, 0.50, 0.50, 1.00, 0.25, 0.75, 0.50, 0.25, 0.50, 0.50, 0.50, 1.00, 0.00, 0.50)
> t.test(e1, e1b, paired=TRUE)

Report: Participants' performance dropped 27.6 percentage points from Test 1 to Test 2 (SD = 29.9%), which represented a significant decline (t(18) = 4.02, p = .0008, d = .92).

Independent samples t-test

A new group of kids was tested.
Question: Did the first group of kids do better or worse than the new group of kids?
Note: no way to pair these observations!!

Group 1                  Group 2
Participant   Score      Participant   Score
Subject 01    1          Subject 20    0.625
Subject 02    0.5        Subject 21    0.75
Subject 03    0.25       Subject 22    0.625
Subject 04    0.75       Subject 23    1
Subject 05    0.75       Subject 24    0.375
Subject 06    0.5        Subject 25    0.75
Subject 07    0.75       Subject 26    0.75
Subject 08    1          Subject 27    0.625
Subject 09    1          Subject 28    0.625
Subject 10    0.5        Subject 29    0.625
Subject 11    1          Subject 30    0.375
Subject 12    1          Subject 31    0.625
Subject 13    1          Subject 32    0.875
Subject 14    1          Subject 33    0.625
Subject 15    0.5        Subject 34    0.5
Subject 16    0.75       Subject 35    0.75
Subject 17    1          Subject 36    0.625
Subject 18    0.75       Subject 37    0.5
Subject 19    0.5        Subject 38    0.625
Mean: .763               Mean: .645
StDev: .243              StDev: .152

H0: Difference is 0
H1: Difference is not 0
Critical t? For df = 38 - 2 = 36: approximately 2.04.

$$s_P^2 = \hat{\sigma}_P^2 = \frac{(n_1 - 1)\hat{\sigma}_1^2 + (n_2 - 1)\hat{\sigma}_2^2}{n_1 + n_2 - 2}$$

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_P^2}{n_1} + \frac{s_P^2}{n_2}}$$

$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{.763 - .645}{.0657} = 1.80$$

We fail to reject the null hypothesis.

Pooled variance

$$s_P^2 = \hat{\sigma}_P^2 = \frac{(n_1 - 1)\hat{\sigma}_1^2 + (n_2 - 1)\hat{\sigma}_2^2}{n_1 + n_2 - 2}$$

This is just the formula for using the two sample SDs to get the pooled SD. But what does it mean? It is just a hifalutin way of stating the SS (SS = sum of squared deviations from the mean).

Variance
  One-sample variance: $s^2 = \frac{SS}{n - 1} = \frac{SS}{df}$
  Paired variance: $s_D^2 = \frac{SS_D}{df}$
  Independent-samples (pooled) variance: $s_P^2 = \frac{SS_1 + SS_2}{df_1 + df_2}$

Standard error of the mean
  One-sample: $\sqrt{\frac{s^2}{n}} = \frac{s}{\sqrt{n}}$
  Paired: $\sqrt{\frac{s_D^2}{n}} = \frac{s_D}{\sqrt{n}}$
  Independent samples: $\sqrt{\frac{s_P^2}{n_1} + \frac{s_P^2}{n_2}}$ (which equals $\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ when $n_1 = n_2$)
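The three variances and standard errors summarized above can all be built from sums of squares. The sketch below is illustrative rather than taken from the slides: the helper `ss` and the other variable names are hypothetical, and the data vectors assume that the scores printed as "0.5" with one decimal are 0.25 and those printed as "0.65" are 0.625, the readings that reproduce the reported means.

```r
e1  <- c(1.00, 0.50, 0.25, 0.75, 0.75, 0.50, 0.75, 1.00, 1.00, 0.50,
         1.00, 1.00, 1.00, 1.00, 0.50, 0.75, 1.00, 0.75, 0.50)   # Test 1, Group 1
e1b <- c(0.50, 0.50, 0.50, 0.50, 0.00, 0.50, 0.50, 0.50, 1.00, 0.25,
         0.75, 0.50, 0.25, 0.50, 0.50, 0.50, 1.00, 0.00, 0.50)   # Test 2, same kids
e2  <- c(0.625, 0.750, 0.625, 1.000, 0.375, 0.750, 0.750, 0.625, 0.625, 0.625,
         0.375, 0.625, 0.875, 0.625, 0.500, 0.750, 0.625, 0.500, 0.625)  # Group 2

ss <- function(x) sum((x - mean(x))^2)   # sum of squared deviations from the mean

# One-sample: SS / df, then sqrt(variance / n)
se_one <- sqrt((ss(e1) / (length(e1) - 1)) / length(e1))

# Paired: same recipe applied to the difference scores
D        <- e1 - e1b
se_D     <- sqrt((ss(D) / (length(D) - 1)) / length(D))
t_paired <- (mean(D) - 0) / se_D

# Independent samples: pool the SS and the df across groups
n1 <- length(e1); n2 <- length(e2)
var_P <- (ss(e1) + ss(e2)) / ((n1 - 1) + (n2 - 1))
se_P  <- sqrt(var_P / n1 + var_P / n2)
t_ind <- (mean(e1) - mean(e2)) / se_P

print(round(c(se_one = se_one, se_D = se_D, se_P = se_P,
              t_paired = t_paired, t_ind = t_ind), 4))
```

The two t values can be compared against `t.test(e1, e1b, paired=TRUE)` and `t.test(e1, e2, var.equal=TRUE)`, which should return the same statistics.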

Independent samples t-test

In R:
> e1 = c(1.00, 0.50, 0.25, 0.75, 0.75, 0.50, 0.75, 1.00, 1.00, 0.50, 1.00, 1.00, 1.00, 1.00, 0.50, 0.75, 1.00, 0.75, 0.50)
> e2 = c(0.625, 0.750, 0.625, 1.000, 0.375, 0.750, 0.750, 0.625, 0.625, 0.625, 0.375, 0.625, 0.875, 0.625, 0.500, 0.750, 0.625, 0.500, 0.625)
> t.test(e1, e2, paired=FALSE, var.equal=TRUE)

Report: Scores in Group 1 (M = 76%, SD = 24%) and scores in Group 2 (M = 65%, SD = 15%) did not differ significantly from each other (t(36) = 1.80, p = .08, d = .60).

Decide which type of t-test is appropriate for the following studies.

a) College students are randomly assigned to receive either behavioral or cognitive therapy. After 20 therapeutic sessions, each student earns a score on a mental health questionnaire.

b) A researcher wishes to determine whether attendance at a day-care center increases the scores of 3-year-old children on a motor skill test. Random assignment dictates which twin from each of 20 pairs of twins attends the day-care center and which twin stays at home.

c) 100 college freshmen are randomly assigned to sophomore roommates who have either similar or dissimilar vocational goals. At the end of their freshman year, the mean GPAs of these two groups are to be analyzed.

d) According to the US Department of Health, the average 16-year-old male can do 23 pushups. A PE instructor finds that in his school district, 30 randomly selected 16-year-old males can do an average of 28 pushups.

e) A child psychologist assigns aggression scores to each of ten children during two 60-minute observation periods separated by an intervening exposure to a series of violent TV cartoons.

Some more tests you should know

Correlated Variables?

Name     Cards Sent   Cards Received
Doris    13           14
Steve    9            18
Mike     7            12
Andrea   5            10
John     1            6
(pg. 19)

Linear Correlation: Test of the null hypothesis that data come from two uncorrelated normal distributions.

r = .80
y = a + bx

$$t = \frac{r - \rho_{hyp}}{\sqrt{\frac{1 - r^2}{n - 2}}} = \frac{r}{\sqrt{\frac{1 - r^2}{n - 2}}}$$

n is the number of sample pairs. Degrees of freedom = n - 2. (pg. 130)
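As a check on the formula, the correlation and its t statistic can be computed directly from the card counts. This is a sketch under one stated assumption: Mike's received count is taken as 12, the value that yields the reported r = .80. The variable names are illustrative.

```r
# Greeting-card data (Mike's received count assumed to be 12)
sent     <- c(Doris = 13, Steve = 9,  Mike = 7,  Andrea = 5,  John = 1)
received <- c(Doris = 14, Steve = 18, Mike = 12, Andrea = 10, John = 6)

n      <- length(sent)                      # number of sample pairs
r      <- cor(sent, received)               # Pearson's r
t_hand <- r / sqrt((1 - r^2) / (n - 2))     # t with rho_hyp = 0
t_crit <- qt(0.975, df = n - 2)             # two-tailed critical t, alpha = .05

fit <- cor.test(sent, received)             # built-in test for comparison

print(round(c(r = r, t_by_hand = t_hand,
              t_from_R = unname(fit$statistic),
              critical_t = t_crit, p = fit$p.value), 3))
```

With only five pairs, df = 3 and the critical t is large, so even r = .80 does not reach significance.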

Reporting your Results

The number of cards sent did tend to correlate with the number of cards received (r = .80). However, the degree of correlation failed to reach significance (t(3) = 2.31, p > .05).

Reminder: Beware spurious r's

[Figure: four scatterplots of artificial data produced by Francis Anscombe. For all data sets, r ≈ .82.]
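Anscombe's four data sets ship with base R as the data frame `anscombe` (columns `x1` to `x4` and `y1` to `y4`), so the point about spurious correlations can be reproduced directly. The sketch below is an illustration added here, with `r_values` as a hypothetical name.

```r
# Pearson's r for each of Anscombe's four x/y pairs
r_values <- sapply(1:4, function(i) {
  cor(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]])
})
names(r_values) <- paste("set", 1:4)
print(round(r_values, 3))

# Plotting the four sets shows how different they look despite similar r:
# op <- par(mfrow = c(2, 2))
# for (i in 1:4) plot(anscombe[[paste0("x", i)]], anscombe[[paste0("y", i)]],
#                     xlab = paste0("x", i), ylab = paste0("y", i))
# par(op)
```

All four correlations come out at about .816, even though only one of the sets looks like a plain linear trend with scatter.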

Independent Random Variables

Two random variables are independent if the value taken by one variable tells you nothing about the value taken by the other variable.

P(X=x, Y=y) = P(X=x) P(Y=y)

Example, with X the first flip and Y the second flip of a coin:
P(X=h, Y=t) = P(X=h) P(Y=t) = (1/2)*(1/2) = 1/4

[Figure: scatterplots contrasting uncorrelated and independent variables]

Free Online Hypothesis Tests & Stats Tools: http://faculty.vassar.edu/lowry/vassarstats.html

Assume a Fair Coin:
[Figure: probability distribution of # of heads, with a "Decide Fair" region and a "Decide Cheat" region]

Assume P(x>4) = P(x<4) = .5
Null hypothesis: P(x>4) = P(x<4)

Participant   Rating (x)   x>4
#1            6            1
#2            1            0
#3            5            1
#4            5            1
#5            7            1

[Figure: probability distribution of # of ratings above 4, with "Retain Null" and "Reject Null" regions]
p = .19, Retain H0

Binomial (Sign) Test

Useful for determining if the median of a sample differs from an a priori value. If an observation equals the median, throw it out. The Wilcoxon T test can also be used, but it makes the additional assumption that the samples come from a symmetric distribution.

Useful for nominal data that can take only one of two values (e.g., heads or tails).

Normally distributed or large n     Non-normally distributed, small n
z-test                              sign test (or Wilcoxon T)
one sample t-test                   sign test (or Wilcoxon T)
repeated measures t-test            ?
independent samples t-test          Mann-Whitney U test
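The p = .19 result for the ratings can be reproduced with a binomial calculation. The sketch below assumes a one-tailed test (probability of at least this many ratings above 4), which is the reading that matches the reported p, and assumes participant #2's rating is 1. Names such as `median0` and `n_above` are illustrative.

```r
ratings <- c(6, 1, 5, 5, 7)     # participant #2's rating assumed to be 1
median0 <- 4                    # a priori median under H0

kept    <- ratings[ratings != median0]   # throw out observations equal to the median
n       <- length(kept)
n_above <- sum(kept > median0)

# P(at least n_above successes out of n) when P(x > 4) = .5
p_hand <- sum(dbinom(n_above:n, size = n, prob = 0.5))

fit <- binom.test(n_above, n, p = 0.5, alternative = "greater")

print(c(n_above = n_above, n = n,
        p_by_hand = p_hand, p_from_R = fit$p.value))
```

Four of the five ratings fall above 4, and the one-tailed probability of that outcome or a more extreme one is 6/32.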

Aphasia Therapy

A therapist devises a series of motor exercises for the left hand that he thinks will improve the ability of aphasic patients with left hemisphere damage to better use their right hemisphere to compensate. He tests the language comprehension ability of eight patients, before and after doing the motor exercises for a week.

Patient   Before   After   Change
1         33%      36%     +3%
2         72%      74%     +2%
3         54%      55%     +1%
4         5%       11%     +6%
5         90%      71%     -19%
6         11%      15%     +4%
7         44%      46%     +2%
8         3%       4%      +1%

H0: P(Improve) = P(Get Worse) = .5

Binomial (aka Sign) Test
[Figure: probability distribution of # of improved patients, with "Retain Null" and "Reject Null (patients improved)" regions]
p = .035, Reject H0

In this class, you will NOT be expected to execute this test by hand! But you should know when it's appropriate to use.
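The aphasia example also shows why the sign test can be the better choice here. The sketch below uses only the change scores from the table (reading the two garbled changes as +2%) and assumes a one-tailed test in the direction of improvement. The comparison with a t-test on the same changes is an added illustration, not something stated in the slides.

```r
# Change in comprehension score (after - before), in percentage points
change <- c(3, 2, 1, 6, -19, 4, 2, 1)

kept       <- change[change != 0]          # drop ties, if any
n          <- length(kept)
n_improved <- sum(kept > 0)

# Sign test: H0 is P(Improve) = P(Get Worse) = .5
sign_fit <- binom.test(n_improved, n, p = 0.5, alternative = "greater")

# For contrast: a t-test on the same change scores
t_fit <- t.test(change, mu = 0)

print(c(n_improved = n_improved, n = n,
        sign_test_p = sign_fit$p.value,
        mean_change = mean(change),
        t_test_p = t_fit$p.value))
```

Seven of eight patients improved, so the sign test rejects H0, while the single large decline pulls the mean change to zero and leaves the t-test with nothing to detect.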

What about Correlation?

Normally distributed or large n     Non-normally distributed, small n
z-test                              sign test (or Wilcoxon T)
one sample t-test                   sign test (or Wilcoxon T)
repeated measures t-test            sign test (or Wilcoxon T)
independent samples t-test          Mann-Whitney U test

[Figure: artificial data produced by Francis Anscombe. For all data sets, r ≈ .82.]

Quantitative & Ordinal Data: Rank Correlation
  Kendall's tau
  Spearman's rho

Name     Cards Sent   Cards Received
Doris    13           14
Steve    9            18
Mike     7            12
Andrea   5            10
John     1            6

Each count is replaced by its rank within its column before the correlation is computed.

Like Pearson's r, tau and rho range from -1 to 1, with the sign of the correlation indicating the direction of the correlation and the magnitude indicating the strength of the correlation.
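Both rank correlations are available through the `method` argument of `cor` in base R. The sketch below applies them to the card data, again assuming Mike's received count is 12. Variable names are illustrative.

```r
sent     <- c(Doris = 13, Steve = 9,  Mike = 7,  Andrea = 5,  John = 1)
received <- c(Doris = 14, Steve = 18, Mike = 12, Andrea = 10, John = 6)

# Ranks within each column (1 = smallest count)
print(rbind(sent_rank = rank(sent), received_rank = rank(received)))

pearson_r <- cor(sent, received)                       # linear correlation
rho       <- cor(sent, received, method = "spearman")  # Pearson's r on the ranks
tau       <- cor(sent, received, method = "kendall")   # concordant vs. discordant pairs

print(round(c(pearson_r = pearson_r, spearman_rho = rho, kendall_tau = tau), 3))
```

Only Doris and Steve swap order between the two columns, so the ranks agree closely: rho comes out at .9 and tau at .8 for these five pairs.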

Rank Correlation

Popular rank correlations:
  Kendall's tau
  Spearman's rho

Pros:
  More general than linear correlation (sensitive to any monotonic relationship)
  Less sensitive to outliers than Pearson's r
  Useful for ordinal data

Cons:
  Less powerful than linear correlation if the data really are linearly correlated
  Inaccurate if lots of tied ranks

In this class, you will NOT be expected to do rank correlation! But you should know when it's appropriate to use.

Summary: Nonparametric Tests (Chapter 20)
  Make minimal assumptions about the distribution of the null hypothesis
  Generally less powerful than parametric tests, if the parametric test assumptions are valid
  More accurate than parametric tests, if the parametric test assumptions are invalid
  Also useful for non-nominal/ratio data

Empirical Loop

[Figure: the empirical loop, linking Research Design, Collect Data, Descriptive Statistics, and Inferential Statistics]

Inferential Statistics
  Hypothesis Testing: z-test, one samp. t-test, ind. samp. t-test, rep. meas. t-test, Mann-Whitney U
  Estimation: confidence intervals around estimates of means and mean differences

Parametric Hypothesis Tests (test: null hypothesis)
  z-test: $\mu = x$, $\sigma = y$
  one sample t-test: $\mu = x$
  repeated measures t-test: $\mu_D = x$
  independent samples t-test
  y = a + bx
Where x and y are constants (e.g., x=0, y=1).

Non-Parametric Hypothesis Tests (test: null hypothesis)
  sign test (or Wilcoxon T): median = x
  sign test (or Wilcoxon T): median difference between pairs of observations = x
  Mann-Whitney U: two independent samples come from same populations
Where x is some constant (e.g., x=0).

Multiple tests

Your alpha level assumes you're doing just a single test. If you do multiple tests, your likelihood of a false positive goes up.

Table 16.3: Aggression Scores after Sleep Deprivation

         Group 1   Group 2   Group 3
         0         3         6
         4         6         8
         2         6         10
Mean:    2         5         8

Three independent samples t-tests.
Assume all data come from normal distributions.
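The three pairwise tests can be sketched as follows. This is an illustration under a stated assumption: Table 16.3 is read as three groups of three scores with means 2, 5, and 8, which requires restoring digits that appear to have been lost in the table. The group labels `g1`, `g2`, `g3` are hypothetical.

```r
# Aggression scores, one vector per group (values assumed from Table 16.3)
g1 <- c(0, 4, 2)
g2 <- c(3, 6, 6)
g3 <- c(6, 8, 10)

comparisons <- list("g1 vs g2" = list(g1, g2),
                    "g1 vs g3" = list(g1, g3),
                    "g2 vs g3" = list(g2, g3))

# One independent samples t-test (pooled variance) per pair of groups
results <- t(sapply(comparisons, function(pair) {
  fit <- t.test(pair[[1]], pair[[2]], var.equal = TRUE)
  c(t = unname(fit$statistic), df = unname(fit$parameter), p = fit$p.value)
}))

print(round(results, 3))
```

Each test is run at its own alpha level, so three tests mean three separate chances of a false positive.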

Testwise alpha level ($\alpha_{PT}$): The probability of making a Type I error in a single hypothesis test.

Familywise alpha level ($\alpha_{PF}$): The probability of making at least one Type I error in one of a family of tests performed on a set of data.

$\alpha_{PF} > \alpha_{PT}$ if there are multiple tests in the family.
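To see how quickly the familywise level grows, one can compute it for a family of independent tests. The formula used below, 1 - (1 - testwise alpha)^n, holds only under the assumption that the tests are independent of one another. The variable names are illustrative.

```r
alpha_pt <- 0.05      # testwise alpha level
n_tests  <- 1:6       # number of tests in the family

# P(at least one Type I error) = 1 - P(no Type I error in any test),
# assuming the tests are independent
alpha_pf <- 1 - (1 - alpha_pt)^n_tests

print(data.frame(n_tests = n_tests, familywise_alpha = round(alpha_pf, 4)))
```

With three independent tests at .05 each, the familywise level is already about .14.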

When you do multiple tests at an alpha level of x, the probability of making at least one Type I error is greater than x!

Multiple Comparisons

Problematic: Rightmost plot assumes the outcome of each test is independent of the others.
Use a more conservative alpha level.

Šidák Correction for Multiple Comparisons

$$\alpha_{PT} = 1 - (1 - \alpha_{PF})^{1/n}$$

where n = the number of comparisons.

For $\alpha_{PF} = .05$, n = 4:

$$\alpha_{PT} = 1 - (1 - .05)^{1/4} = 1 - \sqrt[4]{.95} \approx 1 - .987 = .013$$

Bonferroni Correction is a more conservative alternative.
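The two corrections can be compared numerically. This is a small sketch: the function names `sidak` and `bonferroni` are hypothetical helpers, and the Bonferroni rule used here is the standard one of dividing the familywise alpha by the number of comparisons.

```r
# Testwise alpha needed to hold the familywise alpha at alpha_pf over n comparisons
sidak      <- function(alpha_pf, n) 1 - (1 - alpha_pf)^(1 / n)
bonferroni <- function(alpha_pf, n) alpha_pf / n

alpha_pf <- 0.05
n        <- 4

a_sidak <- sidak(alpha_pf, n)
a_bonf  <- bonferroni(alpha_pf, n)

# Round trip: n independent tests at the Sidak level give back alpha_pf
recovered <- 1 - (1 - a_sidak)^n

print(round(c(sidak = a_sidak, bonferroni = a_bonf,
              familywise_check = recovered), 4))
```

The Bonferroni level is slightly smaller than the Šidák level, which is what makes it the more conservative of the two.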