Introduction to hypothesis testing


Review: Logic of Hypothesis Tests
Usually, we test (attempt to falsify) a null hypothesis (H0), which includes all possibilities except the prediction in the hypothesis (HA).
If the hypothesis (HA) is that an experimental treatment has an effect, the null hypothesis is that there is no effect.
Disproving H0 = evidence that the actual hypothesis is true.

Decision criterion
How low a probability should make us reject H0?
If the probability is less than the significance level (critical p-value, α), then reject H0; otherwise do not reject.
Convention sets the significance level: α = 0.05 (5%).
This is arbitrary: other significance levels might be valid, depending on context.

Three special types of hypothesis tests based on the t distribution
1. The mean of a distribution is different from a constant (one sample t test).
2. The mean difference in pairs of observations is different from a constant (paired t test).
3. Two distributions differ (i.e. the means from two sets of observations do not come from the same distribution of means): two sample t test.

t statistic
General form of the t statistic: $t = \frac{St - \theta}{SE}$, where St is the sample statistic, $\theta$ is the parameter value specified in H0 and SE is the standard error of the sample statistic.
Specific form for a population mean: $t = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$, where $\mu_0$ is the value of the mean specified in H0.

Test statistics
Sampling distributions of t, one for each sample size, when H0 is true: use degrees of freedom (df = n - 1).
The area under each sampling (probability) distribution equals one.
These distributions give the probabilities of obtaining particular ranges of t when H0 is true.

Simple null hypothesis
Test of the hypothesis that the population mean equals a particular value (H0: $\mu = \mu_0$).
These values may come from the literature, other research, or legislation.

One sample t-test
[Figure: Mean(B_To_D) by Group (Europe, Islamic, NewWorld); data set: Ourworld]
Populations are fairly stable if the ratio of births to deaths is close to 1.25.
H0: B/D ratio = 1.25
HA: B/D ratio ≠ 1.25
1) Are the B/D ratios for any of these groups = 1.25?
2) Test using a one sample t-test.

One sample t-tests
Single population: H0: $\mu = 0$ (or any other pre-specified value: here 1.25)
$t = \frac{\bar{y} - 1.25}{s_{\bar{y}}} = \frac{\bar{y} - 1.25}{s / \sqrt{n}}$, with df = n - 1

Results: Europe
[Figure: distribution of B/D ratios for Europe, shown as 1. box plot, 2. normal approximation, 3. histogram; horizontal axis: Probability]

More results
One sample t tests of the B/D ratio against the hypothesized value:

Group       Hypothesized value   Actual estimate   DF   Std Dev   Test statistic (t)   Prob > |t|   Prob > t   Prob < t
Islamic     1.25                 3.47825           15   1.17943   7.5570               <.0001*      <.0001*    1.0000
New World   1.25                 3.95091           20   1.50949   8.1995               <.0001*      <.0001*    1.0000

[Figures: distribution plots for the Islamic and New World groups]

Even more: a way to present the results
[Figure: Births / deaths (95% CI) by group, with H0 marked]
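As a check on the output above, the one sample t statistic can be recomputed from the summary values alone. This is a minimal sketch: the function name `one_sample_t` is hypothetical, and the sample sizes are assumed to be DF + 1 (16 and 21).

```python
import math

def one_sample_t(mean, sd, n, mu0):
    """Return (t, df) for a one sample t test computed from summary statistics."""
    se = sd / math.sqrt(n)          # standard error of the mean
    return (mean - mu0) / se, n - 1

# Summary values from the output above; n is assumed to be DF + 1.
t_islamic, df_islamic = one_sample_t(3.47825, 1.17943, 16, 1.25)
t_newworld, df_newworld = one_sample_t(3.95091, 1.50949, 21, 1.25)
print(round(t_islamic, 3), df_islamic)
print(round(t_newworld, 3), df_newworld)
```

Both statistics should agree with the reported test statistics to rounding; the p-values would still need a t table or a statistics package.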

Two sample t-test
Used to compare two populations, each of which has been sampled.
The simplest form of tests among multiple populations.
Example: does the average annual income differ for males and females? H0: income (males) = income (females)
[Figure: income by SEX (Female, Male); data set: Survey2]

Calculation: H0: $\mu_1 = \mu_2$, i.e. $\mu_1 - \mu_2 = 0$ (independent observations)
$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{s_{\bar{y}_1 - \bar{y}_2}} = \frac{\bar{y}_1 - \bar{y}_2}{s_{\bar{y}_1 - \bar{y}_2}} = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
where $s_p$ = the pooled standard deviation (more later), and df = (n1 - 1) + (n2 - 1) = n1 + n2 - 2.

Logic of the two sample t test
Assume H0: $\mu_1 = \mu_2$ and HA: $\mu_1 > \mu_2$.
1) If H0 is true, then the null distribution is known (for a set df): the central t distribution.
2) If HA is true, we don't know the distribution, but we do know that it is not the null distribution: it is a non-central t distribution.
[Figure: probability of t against $t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{1/n_1 + 1/n_2}}$, showing the central t curve (H0 true) and a non-central t curve (HA true)]

Assume H0: $\mu_1 = \mu_2$, 4 df.
[Figure: central t distribution (H0 true) with the critical value t(0.05, 4 df) = 2.14 marked]
Any t > 2.14 will lead to incorrect rejection of H0.
1. This means that the difference between $\bar{y}_1$ and $\bar{y}_2$ is greater than 2.14 (pooled) standard errors.
2. This will happen 5% of the time.

Assume HA: $\mu_1 > \mu_2$, 4 df.
[Figure: non-central t distribution (HA true) with the critical value t(0.05, 4 df) = 2.14 marked]
Any t < 2.14 will lead to incorrect rejection of HA.
1. This means that the difference between $\bar{y}_1$ and $\bar{y}_2$ is less than 2.14 (pooled) standard errors.
2. The probability that this will happen depends on n and on the true difference between $\mu_1$ and $\mu_2$.

Results of example
What is the conclusion?
[Output tables: Difference in Means]
The unequal variance t-test is based on the Satterthwaite adjustment (of degrees of freedom). It is not recommended unless the variance terms are very different and the sample sizes (n) are very different.

[Figures: histograms of annual income for Female and Male; Annual Income (mean +- SE) by SEX]

Paired t tests: the logic
1. Often there is interest in comparisons of observations that can be considered paired within a subject or replicate.
   a) For example:
      i. A comparison of activity level before and after eating in the same individual
      ii. A comparison of longevity of males vs females, where county is the replicate
2. In such cases there is often benefit in accounting for variance that could be caused by differences among subjects (or replicates).

Paired observations: paired t-test
H0: $\mu_d = 0$, where d is the difference between paired observations
$t = \frac{\bar{d}}{s_{\bar{d}}} = \frac{\bar{d}}{s_d / \sqrt{n_d}}$
where $s_d$ = standard deviation of the sample of differences, and df = n - 1, where n is the number of pairs.

Paired t-test example II
Pisaster comes in two colors along the west coast: purple and orange.
H0: density of purple per site = density of orange
Individual reefs are the replicates of interest.
Looks like a no brainer.
[Figure: Density by COLOR (Orange, Purple); sea star colors, all sites, two sample]

Results of a 2 sample test

GROUP    N   Mean        Standard Deviation
Orange   7   144.71429   101.75086
Purple   7   457.28571   353.47829

Pooled Variance
Difference in Means        : -312.57143
95.00% Confidence Interval : -615.48591 to -9.65695
t                          : -2.24827
df                         : 12.00000
p-value                    : 0.04413

Marginally significant. WHY?
[Figures: back-to-back histograms of NUMBER by COLOR (Orange, Purple), axis Count; Density (95% CI) by color of seastars]

Consider the variability added at the level of replicate (site).
Given that observations are paired at the level of site, can this be accounted for?
[Figures: Density by COLOR; Density by Site (Govpt, Boat, Stair, Shell Beach, Hazards, Cayucos, PSN); Density by SITE with COLOR (Orange, Purple) distinguished]
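The pooled-variance t statistic in the output above can be reproduced from the group means, standard deviations and sample sizes. The sketch below is illustrative only; `pooled_t` is a hypothetical helper, not part of any package used in the slides.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two sample t test with a pooled variance, from summary statistics."""
    df = (n1 - 1) + (n2 - 1)
    # Pooled variance: each sum of squares is recovered as (n - 1) * sd^2.
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(sp2) * math.sqrt(1 / n1 + 1 / n2)
    return (mean1 - mean2) / se, df

# Orange vs purple sea star densities, values from the output above.
t_stat, df = pooled_t(144.71429, 101.75086, 7, 457.28571, 353.47829, 7)
print(round(t_stat, 3), df)
```

The large pooled standard deviation, driven by the purple group, is what keeps this t value modest despite the big difference in means.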

Paired test: details of calculation

Site          Purple   Orange   Difference
Govpt         1023     306      717
Boat          585      155      430
Stair         476      143      333
PSN           233      142      91
Cayucos       107      31       76
Hazards       728      222      506
Shell Beach   49       14       35

mean     312.5714
SEdiff   97.25882
t        3.21381

[Figure: Value for ORANGE and PURPLE, one line per case (site)]
Note the slopes: are they the same? Perhaps rates are a better comparison:
1) Convert to rates, or
2) Log transform

Paired test: details of calculation, using log transformed data

Site          Purple (log)   Orange (log)   Difference
Govpt         3.0098756      2.4857214      0.524154
Boat          2.7671559      2.1903317      0.576824
Stair         2.677607       2.155336       0.522271
PSN           2.3673559      2.1522883      0.215068
Cayucos       2.0293838      1.4913617      0.538022
Hazards       2.8621314      2.346353       0.515778
Shell Beach   1.6901961      1.146128       0.544068

mean     0.490884
SEdiff   0.046604
t        10.53299

[Figure: Value for LORANGE and LPURPLE, one line per case (site)]
Note that the slopes are much more similar. This indicates that:
1) Purples are more common by a constant ratio rather than by a constant amount
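The paired calculations above, on both the raw and the log scale, can be sketched with the standard library. The helper name `paired_t` is hypothetical, and base-10 logs are assumed because they match the tabulated values (for example log10(1023) = 3.0098756).

```python
import math
import statistics

# Densities per site, in the order Govpt, Boat, Stair, PSN, Cayucos, Hazards, Shell Beach.
purple = [1023, 585, 476, 233, 107, 728, 49]
orange = [306, 155, 143, 142, 31, 222, 14]

def paired_t(a, b):
    """Return (mean difference, SE of the difference, t) for paired samples."""
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.mean(diffs)
    se_d = statistics.stdev(diffs) / math.sqrt(len(diffs))   # s_d / sqrt(n_d)
    return mean_d, se_d, mean_d / se_d

raw = paired_t(purple, orange)
logged = paired_t([math.log10(x) for x in purple],
                  [math.log10(x) for x in orange])
print(raw)
print(logged)
```

Pairing removes the among-site variation from the error term, and the log transform further shrinks the spread of the differences, which is why t rises at each step.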

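Review: calculations of t
One sample test: $t = \frac{\bar{y} - \mu_0}{s / \sqrt{n}}$
Two sample test: $t = \frac{\bar{y}_1 - \bar{y}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
Paired test: $t = \frac{\bar{d}}{s_d / \sqrt{n_d}}$

Calculations of standard error
1) One sample t-test: $SE = \frac{s}{\sqrt{n}}$, with $s^2 = \frac{SS}{n - 1}$
2) Paired t-test: $SE = \frac{s_d}{\sqrt{n_d}}$, with $s_d^2 = \frac{SS_d}{n_d - 1}$
3) Two sample t-test (calculation based on pooled variance term): $SE = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$, with $s_p^2 = \frac{SS_1 + SS_2}{(n_1 - 1) + (n_2 - 1)} = \frac{SS_1 + SS_2}{n_1 + n_2 - 2}$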

Testing statistical null hypotheses Hypothesis construction

General hypothesis
A hypothesis that addresses the general question of interest.
H0: There will be no difference in the density of urchins on vertical vs horizontal surfaces
HA: There will be a difference in the density of urchins on vertical vs horizontal surfaces

Specific hypotheses
A hypothesis that represents the specific question addressed in your study. The specifics include:
Location of study
Time period
Replication
Simple description of design

Specific hypothesis
H0: There will be no difference in the density of (species name) on vertical vs horizontal surfaces based on 10 replicate quadrats for each treatment randomly placed within site A sampled on date B
HA: There will be a difference in the density of (species name) on vertical vs horizontal surfaces based on 10 replicate quadrats for each treatment randomly placed within site A sampled on date B
Note: much of this can be placed in the methods section, which would remove the need to state these details in the hypotheses. However, also note that the hypotheses above are what is actually being tested.

Depiction of hypotheses
H0: the specific null hypothesis stated above
[Figure: axis of (Horizontal Density - Vertical Density of Urchins) running from - through 0 to +; H0 sits at 0, with the likelihood that H0 is incorrect increasing in both directions]

Depiction of hypotheses: what should the units be?
[Figure: axis of (Horizontal Density - Vertical Density of Urchins) running from - through 0 to +; H0 at 0, with the likelihood that H0 is incorrect increasing in both directions]
Goal:
To use the same units for all assessments, irrespective of species or system
To have the same set of probabilities based on those units
Hence, units should link to an estimate of confidence.
The most common form is t-values, which provide an estimate of the difference in mean values calibrated by an estimate of error in the assessment of the mean values.

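T-statistic
$T = \frac{\bar{X}_1 - \bar{X}_2}{SE}$

SE and SD
$SE = \frac{SD}{\sqrt{N}}$ (standard error), where N is the number of replicates
$SD = \sqrt{\frac{\sum_i (X_i - \bar{X})^2}{N - 1}}$ (standard deviation)
Example: for the values 30, 40, 45, 37: $\bar{X}$ = 38.000, SD = 6.272, SE = 3.136

Depiction of hypotheses: what should the units be?
[Figure: axis from - through 0 to +, in units of T = (Horizontal Density - Vertical Density of Urchins) / SE; H0 at 0, with the likelihood that H0 is incorrect increasing in both directions]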
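The small SD and SE example above (values 30, 40, 45, 37) can be verified in a few lines. This is only a sketch using Python's standard library; `statistics.stdev` uses the N - 1 divisor, matching the SD formula on the slide.

```python
import math
import statistics

values = [30, 40, 45, 37]
mean = statistics.mean(values)
sd = statistics.stdev(values)          # sample SD, divisor N - 1
se = sd / math.sqrt(len(values))       # standard error of the mean
print(mean, round(sd, 3), round(se, 3))
```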

Depiction of hypotheses: what should the units be?
[Figure: T axis from -3 to 3, T = (Horizontal Density - Vertical Density of Urchins) / SE; H0 at 0, with the likelihood that H0 is incorrect increasing in both directions]
The t-distribution (central t) is a null probability distribution.
It depicts the probability of obtaining each value of T when the null hypothesis is correct.
One use is to estimate confidence levels.


H0: There will be no difference in the density of urchins on vertical vs horizontal surfaces
[Figure: T axis from -3 to 3, T = (Horizontal Density - Vertical Density of Urchins) / SE]

H0: There will be no difference in the density of urchins on vertical vs horizontal surfaces
Including error yields a confidence interval, e.g. 95% confident that the true t value lies within the interval marked on the figure.
[Figure: T axis from -3 to 3, T = (Horizontal Density - Vertical Density of Urchins) / SE, with the 95% CI marked]

HA: There will be a difference in the density of urchins on vertical vs horizontal surfaces
[Figure: the same T axis showing the 100% CI, the central 95% CI, and 2.5% in each tail]

The importance of directionality of the alternative hypothesis (HA)
Consider:
H0: There will be no difference in the density of urchins on vertical vs horizontal surfaces
HA: There will be a difference in the density of urchins on vertical vs horizontal surfaces
vs
H01: Urchin density on horizontal surfaces will be greater than or equal to that on vertical surfaces
HA1: Urchins will be more dense on vertical than on horizontal surfaces

[Figure for H01: T axis from -3 to 3, T = (Horizontal Density - Vertical Density of Urchins) / SE, showing the 100% CI, the 95% CI, and 5% in a single tail]

[Figure for HA1 (urchins will be more dense on vertical than on horizontal surfaces): T axis from -3 to 3, showing the 100% CI, the 95% CI, and 5% in a single tail]

One vs two tailed hypotheses
1. Which is more interesting?
2. Which is more informed?
HA1: Urchins will be more dense on vertical than on horizontal surfaces
HA: There will be a difference in the density of urchins on vertical vs horizontal surfaces
[Figure: two panels on the T axis (-3 to 3), T = (Horizontal Density - Vertical Density of Urchins) / SE. HA1: 95% CI with 5% in one tail. HA: 95% CI with 2.5% in each tail]

One vs two tailed hypotheses
1. Which is more powerful?
HA1: Urchins will be more dense on vertical than on horizontal surfaces
HA: There will be a difference in the density of urchins on vertical vs horizontal surfaces
[Figure: two panels on the T axis (-3 to 3), T = (Horizontal Density - Vertical Density of Urchins) / SE. HA1: 95% CI with 5% in one tail. HA: 95% CI with 2.5% in each tail]

Example
Replication on horizontal and vertical surfaces = 50 each (100 total)
Mean on horizontal surfaces = 33.54
Mean on vertical surfaces = 45.32
Pooled standard deviation = 66.49
$T = \frac{\bar{X}_h - \bar{X}_v}{SE} = \frac{33.54 - 45.32}{66.49 / \sqrt{100}} = \frac{-11.78}{6.649} \approx -1.77$ (the slides report T = -1.79)

One vs two tailed hypotheses
1. Which is more powerful?
HA1 (one tailed): Urchins will be more dense on vertical than on horizontal surfaces. T = -1.79, p = 0.04
HA (two tailed): There will be a difference in the density of urchins on vertical vs horizontal surfaces. T = -1.79, p = 0.08
[Figure: two panels on the T axis (-3 to 3), T = (Horizontal Density - Vertical Density of Urchins) / SE. HA1: 95% CI with 5% in one tail. HA: 95% CI with 2.5% in each tail]

One vs two tailed hypotheses: conversion to original units
HA1 (one tailed): Difference = -11.78, p = 0.04
HA (two tailed): Difference = -11.78, p = 0.08
[Figure: the same two panels with the axis in original units (Horizontal Density - Vertical Density of Urchins), ticks at -19.5, -13.3, -6.65, 0, 6.65, 13.3, 19.5]

This is the difference between 1 and 2 tailed hypotheses: make sure you know which you are dealing with.
Always strive for one tailed hypotheses.
Is there a directional prediction (e.g. > or, separately, <)? If so: one tailed. If not: two tailed.

Assumptions of t test
The t test is a parametric test.
The t statistic only follows the t distribution if:
the variable has a normal distribution (normality assumption)
the two groups have equal population variances (homogeneity of variance assumption)
the observations are independent or specifically paired (independence assumption)

Normality assumption
Data in each group are normally distributed.
Checks: frequency distributions (be careful); boxplots; probability plots; formal tests for normality
Solutions: transformations; don't worry, run it anyway (just kidding, but not entirely)

Homogeneity of variance
Population variances are equal in the 2 groups.
Checks: subjective comparison of sample variances; boxplots; F-ratio test of H0: $\sigma_1^2 = \sigma_2^2$
Solutions: transformations; don't worry, run it anyway (just kidding again, but again not entirely)

F-test on variances
H0: $\sigma_1^2 = \sigma_2^2$
F statistic (F-ratio) = ratio of the 2 sample variances: $F = s_1^2 / s_2^2$
Reject H0 if F is sufficiently less than or greater than 1.
If H0 is true, the F-ratio follows an F distribution.
The usual logic of a statistical test applies.

Boxplot
[Figure: annotated boxplot of LENGTH (axis 50 to 350), marking the median, the ranges that each hold 25% of values, and the smallest and largest values]
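To make the F-ratio described above concrete, the sketch below applies it to the two sea star standard deviations reported in the two sample output (101.75086 for orange, 353.47829 for purple). The function name `variance_ratio` is hypothetical, and putting the larger variance in the numerator is a convention assumed here, not something stated on the slides.

```python
def variance_ratio(sd1, sd2):
    """F-ratio of two sample variances, larger variance in the numerator."""
    v1, v2 = sd1 ** 2, sd2 ** 2
    return max(v1, v2) / min(v1, v2)

# Standard deviations of the orange and purple sea star densities.
f_ratio = variance_ratio(101.75086, 353.47829)
print(round(f_ratio, 2))
```

A ratio this far from 1 suggests the equal-variance assumption deserves a look; judging significance would still require the F distribution with (6, 6) degrees of freedom.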

[Figure: histogram of limpet numbers per quadrat (vertical axis: Count)]
[Figure: example boxplots: 1. Ideal, 2. Skewed, 3. Outliers (marked with *), 4. Unequal variances]

Use of transformations to control departures from the normality and homogeneity of variances assumptions
[Figures: normal probability plots of Pop_1990 on the raw scale and on a log scale; data set: Ourworld]

Variance         Pop_1990   Lpop1990
Europe           441        0.17
Islamic          1378       0.30
Newworld         1042       0.34
Greatest ratio   3.12:1     2:1

[Figures: POP_1990 and LPOP1990 by GROUP (Europe, Islamic, NewWorld)]

Nonparametric tests
Usually based on ranks of the data.
H0: samples come from populations with identical distributions (equal means or medians).
Don't assume a particular underlying distribution of the data: normal distributions are not necessary.
Equal variances and independence are still required.
Typically less powerful than parametric tests when the parametric assumptions are met.

Mann-Whitney-Wilcoxon test
Calculates the sum of ranks in the 2 samples: these should be similar if H0 is true.
Compares the rank sum to the sampling distribution of rank sums (the distribution of rank sums when H0 is true).
Equivalent to a t test on data transformed to ranks.

Additional slides
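The rank-sum step of the Mann-Whitney-Wilcoxon test described above can be sketched as follows. The sea star densities are reused purely as illustration and are treated as two independent samples, which ignores their pairing by site. The helper `rank_sums` is hypothetical, and the U formula (rank sum minus n(n + 1)/2) is the standard one, not something shown on the slides.

```python
def rank_sums(sample_a, sample_b):
    """Rank the pooled data (average ranks for ties) and sum the ranks per sample."""
    pooled = sorted(sample_a + sample_b)
    rank_of = {}
    for value in set(pooled):
        first = pooled.index(value) + 1           # 1-based rank of first occurrence
        count = pooled.count(value)
        rank_of[value] = first + (count - 1) / 2  # tied values share the average rank
    return (sum(rank_of[v] for v in sample_a),
            sum(rank_of[v] for v in sample_b))

orange = [306, 155, 143, 142, 31, 222, 14]
purple = [1023, 585, 476, 233, 107, 728, 49]
r_orange, r_purple = rank_sums(orange, purple)
# Mann-Whitney U for each sample: rank sum minus n(n + 1)/2.
u_orange = r_orange - len(orange) * (len(orange) + 1) / 2
u_purple = r_purple - len(purple) * (len(purple) + 1) / 2
print(r_orange, r_purple, u_orange, u_purple)
```

Under H0 the two rank sums would be similar; judging whether they differ by more than chance requires the null distribution of rank sums, which is not computed here.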

A brief digression to re-sampling theory

Number inside   Number outside
3               10
5               7
2               9
8               12
7               8
Mean: 5         Mean: 9.2

Traditional evaluation would probably involve a t test; another approach is re-sampling.

Resampling

Treatment   Number
Inside      3
Inside      5
Inside      2
Inside      8
Inside      7
Outside     10
Outside     7
Outside     9
Outside     12
Outside     8

1) Assume both treatments come from the same distribution
2) Resample groups of 5 observations, with replacement, but irrespective of treatment

3) Calculate the mean for each group (e.g. 7.6)

4) Repeat many times
5) Calculate differences between pairs of means (remember the null hypothesis is that there is no effect of treatment). This generates a distribution of differences.

Mean 1   Mean 2   Difference
8        7.8      0.2
5.6      8.2      2.6
6        9        3
8        5        3
6        6        0
7        8        1
6        6.8      0.8
8        7.2      0.8
8        6.6      1.4
7        8.4      1.4
6        5.4      0.6
7        6.4      0.6
6.4      6.8      0.4
5        3.4      1.6
6.8      4.8      2
6.4      7.2      0.8
7.2      8        0.8
6.4      4.6      1.8
8.4      6        2.4
7.4      6.6      0.8
5.6      8.4      2.8
8.2      6.2      2
7.8      8.4      0.6
8.6      6.6      2
6        10.2     4.2
6.8      5.6      1.2
6.4      7.8      1.4
7.2      4.8      2.4
6.6      7.2      0.6
7        5.2      1.8
6.6      9.8      3.2
8.4      7.8      0.6

[Figure: distribution of differences, 1000 observations; horizontal axis: Difference in Means (-10 to 10); vertical axes: Number of Observations and Proportion per Bar]
OK, now what?

Compare the distribution of differences to the real difference
Mean number inside = 5, mean number outside = 9.2. Real difference = 4.2
Estimate the likelihood that the real difference comes from two similar distributions.

Mean 1   Mean 2   Difference   Proportion of differences less than current
10.2     3.6      6.6          1
10       3.8      6.2          0.999
10.2     4.4      5.8          0.998
9.2      3.6      5.6          0.997
9.8      4.8      5            0.996
8.8      4.2      4.6          0.995
9.6      5.2      4.4          0.994
9.8      5.6      4.2          0.993
9.8      5.8      4            0.992
9.4      5.4      4            0.991
And on through 1000 differences.

Likelihood is 0.007 that the distributions are the same.
What are the constraints of this sort of approach?
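The resampling steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure used to produce the slides: the function name `resampled_differences`, the seed, and the choice of 10000 repetitions (the slides use 1000) are all hypothetical, and the proportion is computed one-sided.

```python
import random

inside = [3, 5, 2, 8, 7]
outside = [10, 7, 9, 12, 8]
observed = sum(outside) / len(outside) - sum(inside) / len(inside)

def resampled_differences(pooled, group_size, reps, rng):
    """Differences between means of two groups drawn with replacement from pooled data."""
    diffs = []
    for _ in range(reps):
        g1 = rng.choices(pooled, k=group_size)
        g2 = rng.choices(pooled, k=group_size)
        diffs.append(sum(g1) / group_size - sum(g2) / group_size)
    return diffs

rng = random.Random(1)   # fixed seed so the run is repeatable
diffs = resampled_differences(inside + outside, 5, 10000, rng)
# One-sided proportion of resampled differences at least as large as the observed one.
# The small tolerance guards against floating point noise in the comparison.
p_value = sum(d >= observed - 1e-9 for d in diffs) / len(diffs)
print(round(observed, 1), p_value)
```

The estimated proportion varies from run to run with the seed and the number of repetitions, so it should be expected to land near, not exactly on, the 0.007 quoted above.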

T-test vs resampling

Test         P-value
Resampling   0.007
T-test       0.0093

Why the difference?

Additional examples

Worked example
Fecundity of predatory gastropods: samples of 37 and 42 egg capsules of Lepsiella from the littorinid zone and the mussel zone respectively.
Counted the number of eggs per capsule.
Null hypothesis: no difference between zones in mean number of eggs per capsule.
Ward & Quinn (1988), qk2002 Box 3.1

Specify H0 and choose test statistic:
H0: $\mu_M = \mu_L$, i.e. the population mean numbers of eggs per capsule from both zones are equal.
The t statistic is an appropriate test statistic for comparing 2 population means.

Specify a priori significance (probability) level (α):
By convention, use α = 0.05 (5%).
Collect data, check assumptions, calculate test statistic from sample data:

             Mean    SD     n
Littorinid   8.70    3.03   37
Mussel       11.36   2.33   42

t = -5.39, df = 77

Compare the value of the t statistic to its sampling distribution, the probability distribution of the statistic (for specific df) when H0 is true:
what is the probability of obtaining a t value of 5.39 or greater from a t distribution with 77 df?
what is the probability of taking samples with the observed or a greater mean difference from 2 populations with the same means?
Probability (from JMP): P < 0.001
Look up in t table: P < 0.05

If the probability of obtaining this value or larger is less than α, conclude that H0 is unlikely to be true and reject it: statistically significant result.
Our probability (<0.001) is less than 0.05, so reject H0: statistically significant result.
If the probability of obtaining this value or larger is greater than α, conclude that H0 is likely to be true and do not reject it: statistically non-significant result.

Presenting results of t test
Methods: An independent t test was used to compare the mean number of eggs per capsule from the two zones. Assumptions were checked with ...
Results: The mean number of eggs per capsule from the mussel zone was significantly greater than that from the littorinid zone (t = 5.39, df = 77, P < 0.001; see Fig. 2).