Introduction to Statistics in Psychology
PSY
Professor Greg Francis

Lecture 5
Hypothesis testing for two means
Why do we let people die?

HYPOTHESIS TESTING

$H_0: \mu = a$, $H_a: \mu \neq a$
$H_0: \sigma = a$, $H_a: \sigma \neq a$

So far we always compare one sample to a hypothesized population parameter.
Sometimes we want to compare two (or more) population parameters.

$H_0: \mu_1 = \mu_2$, $H_a: \mu_1 \neq \mu_2$

TWO-SAMPLE CASE FOR THE MEAN

Useful when you want to compare the means of two groups:
  different teaching methods
  survival with and without a drug
  depression with and without treatment
  height of males and females

The null hypothesis is that there is no difference between the means:

$H_0: \mu_1 = \mu_2$, $H_a: \mu_1 \neq \mu_2$

or, another way to say the same thing:

$H_0: \mu_1 - \mu_2 = 0$, $H_a: \mu_1 - \mu_2 \neq 0$

STATISTIC

Since we want to compare the difference of two population means, our statistic should be the difference of two sample means:

$$\bar{X}_1 - \bar{X}_2$$

We will compare that statistic to the hypothesized value of the parameter:

$H_0: \mu_1 - \mu_2 = 0$

If the statistic is much different from the hypothesized parameter, we will reject $H_0$.
Same approach as before, different sampling distribution.

SAMPLING DISTRIBUTION

We will compute a t test statistic whose sampling distribution is a t distribution with

$$df = n_1 + n_2 - 2$$

Two restrictions:
1. The two samples drawn from the respective populations are independent.
2. The variances of the two populations are equal.

INDEPENDENCE

Drawing a sample with a particular value of $\bar{X}_1$ should not affect the probability of drawing a sample with any particular value of $\bar{X}_2$.

Remember statistical independence:

$$P(X \text{ and } Y) = P(X)\,P(Y)$$

Same idea here.
INDEPENDENCE

In practice this means we need to be careful about how we sample.

If comparing treatments, randomly divide a random sample into an experimental group and a control group.
  Thus, even if you hope your new treatment will save lives, you have to have one group of patients without the treatment (maybe even a sham treatment). It seems cruel, but you cannot assume the treatment works; you have to demonstrate it.

Take random samples from each population (no overlap, so no risk of dependence).

Avoid situations like repeating subjects: e.g., comparing depression for the same subjects before and after treatment (there are ways to test this situation, but not with these techniques).

To carry out hypothesis testing we need to calculate the standard error.
To get the standard error we need to estimate (or know) the standard deviation $\sigma$.
Since we sample two groups, we need a pooled estimate of $\sigma$.
To get a pooled estimate we need to be certain that $\sigma_1 = \sigma_2$.
  Note this is a statement about the populations.
  We would not expect the sample variances to be identical.

HYPOTHESIS TESTING

We want to compare population means from two populations:

$H_0: \mu_1 = \mu_2$, with $\sigma_1 = \sigma_2 = \sigma$

We have independent samples of size $n_1$ and $n_2$.
Although we draw two random samples (one from each population), we are only interested in one statistic:

$$\bar{X}_1 - \bar{X}_2$$

But we need to know the sampling distribution for this statistic.

SAMPLING DISTRIBUTION OF DIFFERENCES

It turns out that the sampling distribution is familiar.
1. Shape: As sample sizes get large, the distribution becomes normal.
2. Central tendency: The mean of the sampling distribution equals $\mu_1 - \mu_2$.
3. Variability: The standard deviation of the sampling distribution (the standard error of the difference between means) is

$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\sigma^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$

We have to estimate $\sigma^2$ from our data.
Our estimate is called the pooled estimate because we use scores from both samples.

FORMULAS

Deviation formula:

$$s^2 = \frac{\sum (X_{i1} - \bar{X}_1)^2 + \sum (X_{i2} - \bar{X}_2)^2}{n_1 + n_2 - 2}$$

Deviations are relative to the sample mean of each sample!
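Raw score form:

$$s^2 = \frac{\left[ \sum X_{i1}^2 - \left( \sum X_{i1} \right)^2 / n_1 \right] + \left[ \sum X_{i2}^2 - \left( \sum X_{i2} \right)^2 / n_2 \right]}{n_1 + n_2 - 2}$$

$X_{i1}$ refers to the ith score from sample 1
$X_{i2}$ refers to the ith score from sample 2
$n_1$ refers to the number of scores in sample 1
$n_2$ refers to the number of scores in sample 2

FORMULAS

Variances:

$$s^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$$

where
$s_1^2$ is the variance among scores in sample 1
$s_2^2$ is the variance among scores in sample 2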
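The three forms of the pooled variance (deviation, raw score, and sample variances) are algebraically the same quantity. A minimal Python sketch of that equivalence follows; the function names and the two small samples are made up for illustration and are not data from the lecture.

```python
from statistics import mean, variance


def pooled_variance_deviation(x1, x2):
    """Pooled variance from squared deviations about each sample's own mean."""
    m1, m2 = mean(x1), mean(x2)
    ss1 = sum((x - m1) ** 2 for x in x1)
    ss2 = sum((x - m2) ** 2 for x in x2)
    return (ss1 + ss2) / (len(x1) + len(x2) - 2)


def pooled_variance_raw(x1, x2):
    """Pooled variance from raw sums and sums of squares."""
    n1, n2 = len(x1), len(x2)
    ss1 = sum(x * x for x in x1) - sum(x1) ** 2 / n1
    ss2 = sum(x * x for x in x2) - sum(x2) ** 2 / n2
    return (ss1 + ss2) / (n1 + n2 - 2)


def pooled_variance_from_variances(x1, x2):
    """Pooled variance as a df-weighted average of the two sample variances."""
    n1, n2 = len(x1), len(x2)
    # statistics.variance uses the n - 1 denominator (sample variance).
    return ((n1 - 1) * variance(x1) + (n2 - 1) * variance(x2)) / (n1 + n2 - 2)


# Hypothetical scores, for illustration only.
sample_1 = [4.0, 6.0, 5.0, 7.0]
sample_2 = [8.0, 9.0, 7.0, 10.0, 11.0]

v_dev = pooled_variance_deviation(sample_1, sample_2)
v_raw = pooled_variance_raw(sample_1, sample_2)
v_var = pooled_variance_from_variances(sample_1, sample_2)
print(v_dev, v_raw, v_var)
```

All three calls should print the same value up to floating-point rounding, which is why the slide says you can use any formula you want.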
STANDARD ERROR

We use the pooled $s^2$ to calculate an estimate of the standard error for the sampling distribution of differences:

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$

This gives us an estimate of the standard deviation of the sampling distribution of the difference of sample means.
We need to know one more thing.

DEGREES OF FREEDOM

We have two samples with (possibly) different numbers of scores.
The degrees of freedom in sample 1: $df_1 = n_1 - 1$
From sample 2: $df_2 = n_2 - 1$
Added together gives the result (depends on independence!):

$$df = n_1 + n_2 - 2$$

(same as in the denominator of the standard deviation estimate)

HYPOTHESIS TESTING

Now we have everything we need to apply the techniques of hypothesis testing.
1. State the hypothesis and the criterion.
2. Compute the test statistic.
3. Compute the p-value.
4. Make a decision.

EXAMPLE

A neurosurgeon believes that lesions in a particular area of the brain, called the thalamus, will decrease pain perception. If so, this could be important in the treatment of terminal illness accompanied by intense pain. As a first attempt to test this hypothesis, he conducts an experiment in which 16 rats are randomly divided into two groups of 8 each. Animals in the experimental group receive a small lesion in the part of the thalamus thought to be involved in pain perception. Animals in the control group receive a comparable lesion in a brain area believed to be unrelated to pain. Two weeks after surgery each animal is given a brief electrical shock to the paws. The shock is administered with a very low intensity level and increased until the animal first flinches. In this manner, the pain threshold to electric shock is determined for each rat. The following data are obtained. Each score represents the current level (milliamperes) at which flinching is first observed. The higher the current level, the higher is the pain threshold.

HYPOTHESIS

Step 1. State the hypotheses and the criterion.
Directional hypothesis because we expect the lesion will increase the threshold.
$H_0: \mu_1 = \mu_2$ or $\mu_1 - \mu_2 = 0$ (lesion makes no difference)
$H_a: \mu_1 < \mu_2$ or $\mu_1 - \mu_2 < 0$ (lesion increases pain threshold, less sensitivity)

We will set $\alpha = 0.05$ for a one-tailed test.
We expect a negative t value (see $H_a$).

DATA

Now we consider the data from the experiment. The researcher gets the following:

Control Group (False lesion) $X_1$    Experimental Group (Thalamic lesion) $X_2$
.8.9.7.8..6.5..4..9.9.4.7..7
COMPUTING TEST STATISTIC

Step 2. We have $n_1 = 8$, $n_2 = 8$. From the data we calculate

$\bar{X}_1 = 0.875$
$\bar{X}_2 = 1.3625$
$\bar{X}_1 - \bar{X}_2 = -0.4875$
$s = 0.403$ (using any formula you want)

so that the estimate of standard error is

$$s_{\bar{X}_1 - \bar{X}_2} = \sqrt{s^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)} = \sqrt{(0.403)^2 \left( \frac{1}{8} + \frac{1}{8} \right)} = 0.2015$$

COMPUTING THE TEST STATISTIC

$$\text{Test statistic} = \frac{\text{Statistic} - \text{Parameter}}{\text{Standard Error of the Statistic}}$$

$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_{\bar{X}_1 - \bar{X}_2}} = \frac{(0.875 - 1.3625) - 0}{0.2015} = -2.419$$

Step 3. Compute the p-value.
We need to calculate the degrees of freedom:

$$df = n_1 + n_2 - 2 = 16 - 2 = 14$$

We use the t Distribution Calculator to compute $p = 0.015$.

INTERPRET RESULTS

Step 4. Make a decision.
Since $p = 0.015$ is less than $\alpha = 0.05$, we reject $H_0$.
Our interpretation of the test is that the difference between the calculated sample means, or an even bigger difference, would have occurred by chance less than 5% of the time if the null hypothesis were true.
In practice, this means that the study supports the theory that lesions to the thalamus decrease pain perception.
Significant result: this means you have support for the idea that the surgery did affect pain perception.

CONFIDENCE INTERVAL

Basic formula for all confidence intervals:

$$CI = \text{statistic} \pm (\text{critical value})(\text{standard error})$$

For a difference of sample means:

$$CI = (\bar{X}_1 - \bar{X}_2) \pm t_c \, s_{\bar{X}_1 - \bar{X}_2}$$

We already have most of the terms (we get $t_c$ from the Inverse t-distribution calculator), so

$$CI_{95} = (0.875 - 1.3625) \pm (2.1448)(0.2015)$$
$$CI_{95} = (-0.9197, -0.0553)$$

ONLINE CALCULATOR

The calculations are not complicated, but it is usually better to use a computer.
You have to properly format the data.

ONLINE CALCULATOR

You need to understand how to pull out the information you want.
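As a cross-check on the worked example, the test statistic, one-tailed p-value, and 95% confidence interval can be recomputed from the summary statistics alone. The sketch below is not the course's online calculator. It takes the means as 0.875 and 1.3625 and the pooled standard deviation as 0.403, and it approximates the t distribution by numerically integrating its density (Simpson's rule) and finding the critical value by bisection. The helper names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on the density between 0 and |t|."""
    a = abs(t)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area


def t_critical(df, tail_prob):
    """Value c with P(T > c) = tail_prob, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > tail_prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Summary statistics assumed from the example (not recomputed from raw scores).
n1, n2 = 8, 8
mean_1, mean_2 = 0.875, 1.3625
s_pooled = 0.403

se = s_pooled * math.sqrt(1 / n1 + 1 / n2)   # standard error of the difference
t_stat = (mean_1 - mean_2 - 0) / se          # hypothesized difference is 0
df = n1 + n2 - 2
p_one_tailed = t_cdf(t_stat, df)             # lower tail, since H_a is mu_1 < mu_2
t_c = t_critical(df, 0.025)                  # two-sided 95% critical value
diff = mean_1 - mean_2
ci_95 = (diff - t_c * se, diff + t_c * se)

print(se, t_stat, df, p_one_tailed, t_c, ci_95)
```

The printed values should come out close to the ones in the example. A statistics package would normally replace the hand-rolled distribution functions here.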
ASSUMPTIONS

The t-test that we use for hypothesis tests of means is based on three key assumptions.
1. The population distributions are normally distributed. Matters for small sample sizes.
2. Independent scores. For a two-sample t-test, the scores are uncorrelated between populations. (We deal with this case soon.)
3. Homogeneity of variance. For a two-sample t-test, the populations have the same variance (or standard deviation).

If these assumptions do not hold, then the t-distribution that we calculate is not an accurate description of the sampling distribution.

ROBUSTNESS?

Deviation from normal distributions for the populations does not matter very much, especially for large samples.
If we run many tests, we keep the Type I error rate pretty close to what is intended by setting $\alpha$ (e.g., $\alpha = 0.05$).
Show in Robustness Simulation Demonstration in the textbook (.4)
This is true for varying sample sizes.

To carry out hypothesis testing we need to calculate the standard error.
To get the standard error we need to estimate (or know) the standard deviation of the population.
Since we sample two groups, we used a pooled estimate of $\sigma$.
To get a pooled estimate we need to be certain that $\sigma_1 = \sigma_2$.
We need to consider what happens when homogeneity does not hold.

ROBUSTNESS?

For a two-sample t-test, if $n_1 = n_2$, then having $\sigma_1 \neq \sigma_2$ does not matter very much.
If we run many tests, we keep the Type I error rate pretty close to what is intended by setting $\alpha$ (e.g., $\alpha = 0.05$), especially for larger sample sizes.
Show in Robustness Simulation Demonstration in the textbook (.4)
Shape of the population distributions does not matter very much.

ROBUSTNESS?

For a two-sample t-test, if $n_1 \neq n_2$, then having $\sigma_1 \neq \sigma_2$ matters a lot.
If we run many tests, the Type I error rate is much different from what is intended by setting $\alpha$ (e.g., $\alpha = 0.05$).
  Type I error rate is around 37% if a big $\sigma$ is paired with a small $n$.
  Type I error rate is around .% if a big $\sigma$ is paired with a big $n$.
Show in Robustness Simulation Demonstration in the textbook (.4)
Shape of the population distributions does not matter very much.

Our concern is about population variances ($\sigma_1^2$ and $\sigma_2^2$), not about sample variances ($s_1^2$ and $s_2^2$).
It is possible to do a hypothesis test for variances:

$H_0: \sigma_1^2 = \sigma_2^2$, $H_a: \sigma_1^2 \neq \sigma_2^2$

Note, it would be nice if we did not reject $H_0$, because then we could use our original method.
If we reject $H_0$, we must make some adjustments to hypothesis testing for the means.
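The robustness claims for unequal variances can be explored with a small simulation. This is a sketch under stated assumptions, not the textbook's Robustness Simulation Demonstration. It draws both samples from normal populations with equal means (so the null hypothesis is true), uses standard deviations of 4 and 1, and picks sample sizes so that df = 28 in every condition. The function names, sample sizes, standard deviations, and seed are all illustrative choices, so the rates will not match the slide's figures exactly.

```python
import math
import random


def pooled_t(x1, x2):
    """Two-sample t statistic using the pooled variance estimate."""
    n1, n2 = len(x1), len(x2)
    m1, m2 = sum(x1) / n1, sum(x2) / n2
    ss1 = sum((x - m1) ** 2 for x in x1)
    ss2 = sum((x - m2) ** 2 for x in x2)
    s2 = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / math.sqrt(s2 * (1 / n1 + 1 / n2))


def type_i_error_rate(n1, sd1, n2, sd2, t_crit, reps=4000, seed=1):
    """Proportion of two-tailed rejections when both population means are 0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x1 = [rng.gauss(0.0, sd1) for _ in range(n1)]
        x2 = [rng.gauss(0.0, sd2) for _ in range(n2)]
        if abs(pooled_t(x1, x2)) > t_crit:
            rejections += 1
    return rejections / reps


# Two-tailed critical value for alpha = 0.05 and df = 28, taken from a t table.
T_CRIT_DF28 = 2.048

# Equal sample sizes, unequal standard deviations.
rate_equal_n = type_i_error_rate(15, 4.0, 15, 1.0, T_CRIT_DF28)
# Big standard deviation paired with the small sample.
rate_big_sd_small_n = type_i_error_rate(5, 4.0, 25, 1.0, T_CRIT_DF28)
# Big standard deviation paired with the big sample.
rate_big_sd_big_n = type_i_error_rate(25, 4.0, 5, 1.0, T_CRIT_DF28)

print(rate_equal_n, rate_big_sd_small_n, rate_big_sd_big_n)
```

Under these assumptions the equal-n rate should stay near 0.05. The rate should be far above 0.05 when the big standard deviation goes with the small sample, and far below it when the big standard deviation goes with the big sample. That is the same pattern the slide describes.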
We are not actually going to do the hypothesis test for homogeneity of variance.
It is messy and (a bit) confusing.
Just remember:
1. If the sample sizes are equal, then you are fine with the standard method.
2. If the sample sizes are unequal, then you might want to worry about homogeneity of variance. If $s_1^2 \approx s_2^2$, then you are probably also fine.

If you think you do not have homogeneity of variance, then you can run a revised version of the test. Some people (including your textbook) recommend this as the default approach.

CONCLUSIONS

Comparing two means
Independent samples
More flexible than the one-sample case
Many more experiments can be tested
Same basic technique

NEXT TIME

Welch's test
Power
Planning a replication study