Jian WANG, PhD. Room A115 College of Fishery and Life Science Shanghai Ocean University

Jian WANG, PhD j_wang@shou.edu.cn Room A115 College of Fishery and Life Science Shanghai Ocean University

Useful Links
Slides: http://sihua.us/biostatistics.htm
Datasets: http://users.monash.edu.au/~murray/bdar/index.html
RStudio: https://www.rstudio.com

RStudio friendly IDE for R

RStudio window panes: new script, input scripts, Environment & History, plots & help, results

Contents
1. Introduction to R
2. Data sets
3. Introductory Statistical Principles
4. Sampling and experimental design with R
5. Graphical data presentation
6. Simple hypothesis testing
7. Introduction to Linear models
8. Correlation and simple linear regression
9. Single factor classification (ANOVA)
10. Nested ANOVA
11. Factorial ANOVA
12. Simple Frequency Analysis

ANOVA (Analysis of variance)
One-way ANOVA, also known as single factor classification, is used to investigate the effect of a single factor comprising two or more groups (levels) in a completely randomized design.
E.g. temperature, concentration of a drug
A factor with four levels

Example A: zinc contamination on the diversity of diatom species
Medley and Clements (1998) investigated the impact of zinc contamination (and other heavy metals) on the diversity of diatom species in the USA Rocky Mountains.
Response: the diversity of diatoms (number of species)
Factor: degree of zinc contamination (high, medium, low or natural background level)
Data were recorded from between four and six sampling stations within each of six streams known to be polluted. These data were used to test the null hypothesis that there were no differences in the diversity of diatoms between different zinc levels.

F-ratios
F-ratios and corresponding R syntax for single factor ANOVA designs
MS: mean square, i.e. the mean of the squared variation

F-distribution
Comparing the plots of the probability density function (pdf) of the F distribution for various degrees of freedom:
solid line: pdf of F(1, 1)
dashed line: pdf of F(2, 5)
dotted line: pdf of F(10, 20)
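A minimal sketch of how such a comparison plot could be drawn in base R. The grid `x`, the axis limits and the variable names are illustrative choices, not taken from the slide.

```r
# Densities of three F distributions, drawn with base R graphics.
# The grid starts just above 0 because the F(1, 1) density is unbounded at 0.
x <- seq(0.01, 5, length.out = 500)

f_1_1   <- df(x, df1 = 1,  df2 = 1)    # F(1, 1)
f_2_5   <- df(x, df1 = 2,  df2 = 5)    # F(2, 5)
f_10_20 <- df(x, df1 = 10, df2 = 20)   # F(10, 20)

# Solid, dashed and dotted lines, matching the legend on the slide
plot(x, f_1_1, type = "l", lty = "solid", ylim = c(0, 1.2),
     xlab = "F", ylab = "Probability density")
lines(x, f_2_5, lty = "dashed")
lines(x, f_10_20, lty = "dotted")
legend("topright", legend = c("F(1, 1)", "F(2, 5)", "F(10, 20)"),
       lty = c("solid", "dashed", "dotted"))
```

`df()` is the density function of the F distribution in the stats package; its companions `pf()` and `qf()` give cumulative probabilities and quantiles.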

F-distribution
E.g. the density plot of the F(3, 23) distribution, i.e. the distribution of the F statistic assuming that the null hypothesis is true. The observed value of the test statistic is f = 3.2, and the corresponding p-value is shown as the shaded area above 3.2.
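The shaded area can be computed directly as an upper-tail probability. A short sketch using the values quoted on the slide (f = 3.2 with 3 and 23 degrees of freedom); the variable names and the 5% critical value are added for illustration only.

```r
# Upper-tail area of the F(3, 23) distribution above the observed statistic
f_obs   <- 3.2
p_value <- pf(f_obs, df1 = 3, df2 = 23, lower.tail = FALSE)

# Critical value at alpha = 0.05, for comparison with f_obs
f_crit <- qf(0.95, df1 = 3, df2 = 23)

p_value
f_crit
```

If `f_obs` exceeds `f_crit`, the p-value is below 0.05 and H0 would be rejected at that level.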

Fixed factor & Random factor
Fixed factor: the levels can be controlled (chosen) by the experimenter, e.g. three specific temperatures.
Random factor: the levels cannot be controlled, e.g. three operators.

Fixed factor
H0: the population group means are all equal, $H_0: \mu_1 = \mu_2 = \dots = \mu_k = \mu$, or equivalently the effect of each group equals zero, $\alpha_i = 0$ for every group $i$.

Random factor
H0: the variance between all possible groups equals zero, i.e. the added variance due to this factor equals zero: $H_0: \sigma_\alpha^2 = 0$.

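Linear model
$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where $\mu$ is the overall mean, $\alpha_i$ is the effect of group $i$ and $\varepsilon_{ij}$ is the residual of observation $j$ in group $i$.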

Assumptions of ANOVA
Hypothesis testing for a single factor ANOVA model assumes that the residuals (and therefore the response variable for each of the treatment levels) are all:
(i) normally distributed
(ii) equally varied
(iii) independent of one another
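As a rough illustration of how assumptions (i) and (ii) can be examined on the residuals of a fitted model, here is a sketch on simulated data. The data frame `sim`, its group labels and the chosen means are invented for this example. Independence (iii) follows from the sampling design and cannot be checked this way.

```r
# Simulated single factor data: 4 groups of 8 observations (hypothetical values)
set.seed(1)
sim <- data.frame(
  group = factor(rep(c("A", "B", "C", "D"), each = 8)),
  y     = rnorm(32, mean = rep(c(10, 12, 11, 14), each = 8), sd = 2)
)

fit <- aov(y ~ group, data = sim)
res <- residuals(fit)

# (i) normality of the residuals
shapiro_p <- shapiro.test(res)$p.value

# (ii) homogeneity of variance across groups
bartlett_p <- bartlett.test(y ~ group, data = sim)$p.value

# Graphical checks: residuals vs fitted values, and a normal Q-Q plot
par(mfrow = c(1, 2))
plot(fitted(fit), res, xlab = "Fitted values", ylab = "Residuals")
qqnorm(res)
qqline(res)

c(shapiro = shapiro_p, bartlett = bartlett_p)
```

Large p-values here mean only that no violation was detected; with small samples the graphical checks are usually the more informative of the two.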

Tests of trends and means comparisons
When H0 is rejected, researchers often wish to examine patterns of differences among groups. However, this requires multiple comparisons of group means, and multiple comparisons inflate the Type I error rate unless a correction is applied.
Post-hoc unplanned pairwise comparisons, e.g. Bonferroni, LSR (Duncan, Newman-Keuls), Tukey HSD
Planned comparisons

ANOVA in R
Model construction: lm(), aov()
View ANOVA table: summary(), anova()

Example A: zinc contamination on the diversity of diatom species
## 1 - Import dataset (notice the directory)
> setwd("path/to/data")   # placeholder: replace with the folder that holds medley.csv
> medley <- read.table('medley.csv', header=TRUE, sep=',')
> medley   # check data
> boxplot(diversity~zinc, medley)   # groups are not in a logical order

Example A: zinc contamination on the diversity of diatom species
## 2 - Reorganize the levels of the categorical factor into a more logical order
> medley$zinc   # 1st *
> medley$zinc <- factor(medley$zinc, levels=c('HIGH', 'MED', 'LOW', 'BACK'), ordered=FALSE)
> medley$zinc   # 2nd *
* find the difference between the 1st and 2nd printouts

Example A: zinc contamination on the diversity of diatom species
## 3 - Assess normality/homogeneity of variance using a boxplot of species diversity against zinc group
> boxplot(diversity~zinc, medley)
Conclusions: no obvious violations of normality or homogeneity of variance; the boxplots are basically symmetrical

Example A: zinc contamination on the diversity of diatom species
## 4 - Assess the homogeneity of variance assumption with a plot of mean vs variance
> plot(tapply(medley$diversity, medley$zinc, mean), tapply(medley$diversity, medley$zinc, var))
Conclusions: no obvious relationship between group mean and variance

Example A: zinc contamination on the diversity of diatom species
## 3 - Assess normality using the Shapiro-Wilk test of species diversity within each zinc group
> library("plyr")
> # one Shapiro-Wilk p-value per zinc level
> ddply(medley, .(zinc), function(x) {data.frame(pvalue = shapiro.test(x$diversity)$p.value)})

Example A: zinc contamination on the diversity of diatom species
## 3 - Assess homogeneity of variance using the Bartlett test of species diversity against zinc group
> bartlett.test(medley$diversity~medley$zinc)

Example A: zinc contamination on the diversity of diatom species
## 5 - Test H0 that population group means are all equal
> medley.aov <- aov(diversity ~ zinc, medley)   # column names are case-sensitive and must match the CSV header
> medley.aov

Example A: zinc contamination on the diversity of diatom species
## 5 - Test H0 that population group means are all equal (model diagnostics)
> par(mfrow = c(2, 2))
> plot(medley.aov)
Conclusions: no obvious violations of normality or homogeneity of variance
(slide annotation on one diagnostic panel: meaningless)

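Example A: zinc contamination on the diversity of diatom species
## 6 - Examine the ANOVA table
> anova(medley.aov)
> summary(medley.aov)
Between groups: $MS_B = \frac{SS_B}{k-1}$, with $k-1$ degrees of freedom ($k$: number of groups)
Within groups: $MS_W = \frac{SS_W}{N-k}$, with $N-k$ degrees of freedom ($N$: total number of observations)
F-ratio: $F_{(k-1,\,N-k)} = \frac{MS_B}{MS_W}$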
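To make the ANOVA table concrete, the following sketch computes SS_B, SS_W, the mean squares and the F-ratio by hand and compares them with `anova()`. It runs on simulated data (`sim` and its group labels are invented here), not on the medley dataset.

```r
# Simulated unbalanced single factor data (hypothetical values)
set.seed(42)
sizes <- c(5, 6, 7)
sim <- data.frame(
  group = factor(rep(c("g1", "g2", "g3"), times = sizes)),
  y     = rnorm(sum(sizes), mean = rep(c(5, 6, 8), times = sizes))
)

N <- nrow(sim)              # total number of observations
k <- nlevels(sim$group)     # number of groups

grand_mean   <- mean(sim$y)
group_mean_i <- ave(sim$y, sim$group)   # each observation's own group mean

# Between-group and within-group sums of squares
ss_b <- sum((group_mean_i - grand_mean)^2)
ss_w <- sum((sim$y - group_mean_i)^2)

# Mean squares, F-ratio and p-value
ms_b    <- ss_b / (k - 1)
ms_w    <- ss_w / (N - k)
f_ratio <- ms_b / ms_w
p_value <- pf(f_ratio, df1 = k - 1, df2 = N - k, lower.tail = FALSE)

# The same quantities from the built-in ANOVA table
tab <- anova(lm(y ~ group, data = sim))
tab
c(F_by_hand = f_ratio, p_by_hand = p_value)
```

Summing `(group_mean_i - grand_mean)^2` over observations is the same as weighting each squared group deviation by its group size, which is why the unbalanced design needs no special handling.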

Example A: zinc contamination on the diversity of diatom species
## 7 - Option: use a linear model to do the ANOVA
> anova(lm(diversity ~ zinc, medley))   # column names are case-sensitive and must match the CSV header

Example A: zinc contamination on the diversity of diatom species
## 6 - Examine the ANOVA table
> anova(medley.aov)
> summary(medley.aov)
Conclusions: at least one of the population group means differs from the others

Post-hoc unplanned pairwise comparison
One-way ANOVA results:
- Rejecting the H0 that all population group means are equal only indicates that at least one of the population group means differs from the others.
- However, it does not indicate which groups differ from which other groups.
- Multiple comparisons of group means, with correction, are required.

Post-hoc unplanned pairwise comparison
Problems of multiple comparisons:
1- Multiple significance tests increase the probability of Type I errors (α, the probability of falsely rejecting H0). E.g. 5 groups give 10 pairwise comparisons; with α = 0.05 per test, the probability of at least one Type I error is 1 - 0.95^10 ≈ 0.40.
2- The outcomes of the tests might not be independent (orthogonal). E.g. if A > B and B > C, we already know that A > C.
Corrections for multiple comparisons are therefore needed.
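The arithmetic in point 1 can be reproduced in a few lines. This sketch treats the 10 tests as independent, which is the simplification behind the 1 - 0.95^10 figure. The raw p-values passed to `p.adjust()` are invented purely to show the call.

```r
alpha    <- 0.05
n_groups <- 5

# Number of pairwise comparisons among 5 groups
n_tests <- choose(n_groups, 2)

# Probability of at least one Type I error across all tests (independence assumed)
fwer <- 1 - (1 - alpha)^n_tests

# Bonferroni correction: test each comparison at alpha / n_tests instead
alpha_bonf <- alpha / n_tests
fwer_bonf  <- 1 - (1 - alpha_bonf)^n_tests

# Equivalent view: inflate the p-values and compare them with the original alpha
raw_p <- c(0.004, 0.020, 0.049)   # hypothetical raw p-values
adj_p <- p.adjust(raw_p, method = "bonferroni", n = n_tests)

c(n_tests = n_tests, fwer = fwer, fwer_bonf = fwer_bonf)
adj_p
```

With the correction, the family-wise error rate stays just under the nominal 0.05, at the cost of lower power for each individual comparison.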

Example A: zinc contamination on the diversity of diatom species
## 7 - Post-hoc tests to investigate pairwise mean differences between all groups
> # option 1
> TukeyHSD(medley.aov)
> # option 2
> require('multcomp')
> summary(glht(medley.aov, linfct = mcp(zinc = "Tukey")))
> # option 3
> require("DescTools")
> PostHocTest(medley.aov, method = "hsd")
Tukey's Honestly Significant Difference (HSD) test for multiple comparisons

Example A: zinc contamination on the diversity of diatom species ##7 Post-hoc between all groups to investigate pairwise mean differences

Example A: zinc contamination on the diversity of diatom species
## 8 - Summarize result with a bargraph
Using the biology package: not available now

Example A: zinc contamination on the diversity of diatom species
## 8 - Summarize result with a bargraph
Add the * symbol manually with graphics software such as Adobe Illustrator *
> # calculate mean & sd separately
> mean1 <- tapply(medley$diversity, medley$zinc, mean)
> sd1 <- tapply(medley$diversity, medley$zinc, sd)
> dd1 <- data.frame(mean1, sd1)
> ylim <- c(0, max(dd1$mean1) + 2 * max(dd1$sd1))
> mp <- barplot(dd1$mean1, ylab="diversity", xlab="Zinc Concentration", names.arg=row.names(dd1), ylim=ylim)
> # vertical error bars (mean ± sd), then their lower and upper end caps
> segments(mp, dd1$mean1 - dd1$sd1, mp, dd1$mean1 + dd1$sd1)
> segments(mp - 0.1, dd1$mean1 - dd1$sd1, mp + 0.1, dd1$mean1 - dd1$sd1)
> segments(mp - 0.1, dd1$mean1 + dd1$sd1, mp + 0.1, dd1$mean1 + dd1$sd1)

Example A: zinc contamination on the diversity of diatom species
## 8 - Summarize result with a bargraph
Using ggplot2 *
> library(reshape2)
> library(ggplot2)
> library(plyr)
> mdata.m <- tapply(medley$diversity, medley$zinc, mean)
> mdata.sd <- tapply(medley$diversity, medley$zinc, sd)
> data.r <- data.frame(mdata.m, mdata.sd)
> data.r$zinc <- row.names(data.r)
> ggplot(data.r, aes(zinc, mdata.m, fill=zinc)) +
    geom_bar(stat = "identity", width = 0.5) +
    geom_errorbar(aes(ymin=mdata.m-mdata.sd, ymax=mdata.m+mdata.sd), width=0.2) +
    scale_y_continuous(expand = c(0,0), limits=c(0,3)) +   ## limits should be adjusted accordingly
    scale_x_discrete(limits=data.r$zinc) +
    ylab("diversity") +
    theme_bw() +
    theme(panel.grid.major=element_blank(), panel.grid.minor=element_blank())

ggplot2: Elegant Graphics

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
Keough and Raimondi (1995) examined the effects of four biofilm types on the recruitment of serpulid larvae:
SL: sterile unfilmed substrate
NL: netted laboratory biofilms
UL: unnetted laboratory biofilms
F: netted field biofilms
Substrates treated with one of the four biofilm types were left in shallow marine waters for one week, after which the number of newly recruited serpulid worms was counted.

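Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
The linear effect model would be: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$, where $y_{ij}$ is the number of serpulid recruits on substrate $j$ of biofilm type $i$, $\mu$ is the overall mean, $\alpha_i$ is the effect of biofilm type $i$ and $\varepsilon_{ij}$ is the residual.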

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
## 1&2 - Check the assumptions and scale data if appropriate
> keough <- read.table("keough.csv", header = TRUE, sep = ",")
> dev.off()   ## if necessary
> boxplot(serp ~ BIOFILM, data = keough)
> boxplot(log10(serp) ~ BIOFILM, data = keough)

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
## 1&2 - Check the assumptions and scale data if appropriate
[Boxplot panels: untransformed; log10 scale]

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
## 1&2 - Check the assumptions and scale data if appropriate
> # group variance against group mean, untransformed and log10 transformed
> with(keough, plot(tapply(serp, BIOFILM, mean), tapply(serp, BIOFILM, var)))
> with(keough, plot(tapply(log10(serp), BIOFILM, mean), tapply(log10(serp), BIOFILM, var)))

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
## 1&2 - Check the assumptions and scale data if appropriate
[Mean vs variance panels: untransformed; log10 scale]
Conclusions: the untransformed data show some evidence of a relationship between population mean and population variance. The log10 transformed data meet the assumptions better, so the transformation is appropriate.

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
SL: sterile unfilmed substrate
NL: netted laboratory biofilms
UL: unnetted laboratory biofilms
F: netted field biofilms
Comparisons:
NL vs UL
F vs (NL & UL)
SL vs (F & NL & UL)

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
## 3&4 - Define a list of contrasts for the planned comparisons
> keough$BIOFILM <- factor(keough$BIOFILM)   # make sure it is a factor (read.table returns character columns in R >= 4.0)
> keough$BIOFILM   # 1st *
> contrasts(keough$BIOFILM) <- cbind(c(0, 1, 0, -1), c(2, -1, 0, -1), c(-1, -1, 3, -1))
> round(crossprod(contrasts(keough$BIOFILM)), 2)
> keough$BIOFILM   # 2nd *
Conclusions: all defined planned contrasts are orthogonal (the values above and below the diagonal of the cross-product matrix are all zero).
* notice the difference between the 1st and 2nd printouts
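The orthogonality check can be reproduced without the dataset. This sketch rebuilds the same three contrast vectors and assumes the factor levels are in alphabetical order (F, NL, SL, UL), which is R's default. The column labels mirror the contrast labels used in step 5, and the object names are illustrative.

```r
# Contrast coefficients, one column per planned comparison.
# Rows follow the assumed level order F, NL, SL, UL.
cm <- cbind(
  "NL vs UL"        = c( 0,  1, 0, -1),
  "F vs (NL&UL)"    = c( 2, -1, 0, -1),
  "SL vs (F&NL&UL)" = c(-1, -1, 3, -1)
)
rownames(cm) <- c("F", "NL", "SL", "UL")

# Each contrast must sum to zero
sums_to_zero <- all(colSums(cm) == 0)

# Cross-products: off-diagonal entries are the pairwise dot products
cp <- crossprod(cm)
orthogonal <- all(cp[upper.tri(cp)] == 0)

cp
c(sums_to_zero = sums_to_zero, orthogonal = orthogonal)
```

With four groups there are at most three mutually orthogonal contrasts, so this set uses up all of the between-group degrees of freedom.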

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
## 5 - Define contrast labels and model construction
> # the list name must match the model term (BIOFILM); the numbers index the contrast columns
> keough.list <- list(BIOFILM = list('NL vs UL' = 1, 'F vs (NL&UL)' = 2, 'SL vs (F&NL&UL)' = 3))
> keough.aov <- aov(log10(serp) ~ BIOFILM, data = keough)
> par(mfrow = c(2, 2))
> plot(keough.aov)
> summary(keough.aov, split = keough.list)

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
## 5 - Define contrast labels and model construction
(slide annotation on one diagnostic panel: meaningless)

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
## 5 - Define contrast labels and model construction
Conclusions:
Biofilm treatments were found to have a significant effect on the mean log10 number of serpulid recruits ($F_{3,24} = 6.0058$, $P = 0.003$).
The presence of a net (NL) over the substrate was not found to alter the mean log10 serpulid recruits compared to a surface without a net (UL) ($F_{1,24} = 0.6352$, $P = 0.4332$).
Field biofilms (F) were not found to have different mean log10 serpulid recruits than the laboratory biofilms (NL, UL) ($F_{1,24} = 0.6635$, $P = 0.4233$).
Unfilmed treatments were found to have significantly lower mean log10 serpulid recruits than treatments with biofilms ($F_{1,24} = 16.719$, $P < 0.001$).

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
## 5 - Define contrast labels and model construction
Significant effects were found for:
Overall biofilm treatments ($F_{3,24} = 6.0058$, $P = 0.003$)
Unfilmed treatments versus treatments with biofilms ($F_{1,24} = 16.719$, $P < 0.001$)

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
## 6 - Summarize findings with a bargraph
> # group means, standard deviations, sample sizes and standard errors
> means <- with(keough, tapply(serp, BIOFILM, mean, na.rm = TRUE))
> sds <- with(keough, tapply(serp, BIOFILM, sd, na.rm = TRUE))
> n <- with(keough, tapply(serp, BIOFILM, length))
> ses <- sds/sqrt(n)
> ys <- pretty(c(means - ses, means + (2 * ses)))
> xs <- barplot(means, beside = TRUE, axes = FALSE, ann = FALSE, ylim = c(min(ys), max(ys)), xpd = FALSE)
> # error bars of one standard error either side of the mean
> arrows(xs, means + ses, xs, means - ses, angle = 90, length = 0.1, code = 3)
> axis(2, las = 1)
> mtext(2, text = "Mean number of serpulids", line = 3, cex = 1.5)
> mtext(1, text = "Biofilm treatment", line = 3, cex = 1.5)
> box(bty = "l")

Example B: Single factor ANOVA with planned comparisons: four biofilm types on the recruitment of serpulid larvae
## 6 - Summarize findings with a bargraph
[Bar graph: mean number of serpulids (y-axis, 80 to 200) by biofilm treatment (x-axis: F, NL, SL, UL)]

Robust classification: alternatives to ANOVA
For data with either non-normality or unequal variance:
Welch's test: adjusts the degrees of freedom to maintain test reliability in situations where populations are normally distributed but unequally varied.
Kruskal-Wallis test: a non-parametric (rank-based) test for non-normal data.
Randomization tests: do not assume observations were collected via random sampling; however, they do assume that populations are equally varied.
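A sketch of how the three alternatives might be run in R, using simulated data with deliberately unequal spread (the data frame `sim` and all values are invented). The randomization test is written out by hand as one simple way to implement the idea, by permuting the responses among groups and recomputing the F-ratio; it is not a procedure taken from the slides.

```r
# Simulated data: three groups whose spread increases with the mean (hypothetical)
set.seed(123)
sim <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 10)),
  y     = c(rnorm(10, 10, 1), rnorm(10, 12, 3), rnorm(10, 15, 6))
)

# Welch's test: oneway.test() does not assume equal variances by default
welch <- oneway.test(y ~ group, data = sim)

# Kruskal-Wallis rank-based test
kw <- kruskal.test(y ~ group, data = sim)

# Randomization test on the F-ratio
f_stat <- function(y, g) anova(lm(y ~ g))[["F value"]][1]
f_obs  <- f_stat(sim$y, sim$group)
n_perm <- 999
f_perm <- replicate(n_perm, f_stat(sample(sim$y), sim$group))
# Proportion of permuted F-ratios at least as large as the observed one
# (the observed arrangement counts as one of the permutations)
p_perm <- (sum(f_perm >= f_obs) + 1) / (n_perm + 1)

c(welch = welch$p.value, kruskal = kw$p.value, randomization = p_perm)
```

The three p-values need not agree closely, since each test targets a slightly different null hypothesis and rests on different assumptions.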

Example E: Kruskal-Wallis test
The effect of different sugar treatments on pea length was investigated:
- Control
- 2% glucose added
- 2% fructose added
- 1% glucose and 1% fructose added
- 2% sucrose added

Example E: Kruskal-Wallis test
The effect of different sugar treatments on pea length
## 1 - Import data and check normality and equal variance
> purves <- read.table('purves.csv', header=TRUE, sep=',')
> dev.off()   # if necessary
> boxplot(length~treat, data=purves)
Conclusion: unequal variance.
Note: this dataset would also be suited to a Welch's test. For the purpose of providing worked examples that are consistent with popular biometry texts, a Kruskal-Wallis test will be demonstrated.

Example E: Kruskal-Wallis test
The effect of different sugar treatments on pea length
## 2 - Perform the non-parametric Kruskal-Wallis test
> kruskal.test(length~treat, data=purves)

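Example E: Kruskal-Wallis test
The effect of different sugar treatments on pea length
## 2 - Perform post-hoc tests
> # pairwise t tests with separate (unpooled) variances and FDR-adjusted p-values
> pairwise.t.test(purves$length, purves$treat, pool.sd=FALSE, p.adjust.method="fdr")
fdr: False discovery rate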
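The slide follows the Kruskal-Wallis test with pairwise t tests. If a rank-based follow-up is preferred, one option in base R is `pairwise.wilcox.test()`. The sketch below swaps in that technique, which is not the slide's method, and applies it to simulated data (`sim`, its treatment labels and all values are invented).

```r
# Simulated pea lengths for three hypothetical treatments
set.seed(7)
sim <- data.frame(
  treat = factor(rep(c("control", "glucose", "sucrose"), each = 10)),
  len   = c(rnorm(10, 70, 4), rnorm(10, 60, 2), rnorm(10, 64, 1))
)

# Omnibus rank-based test
kw <- kruskal.test(len ~ treat, data = sim)

# Pairwise Wilcoxon rank-sum tests with FDR-adjusted p-values
pw <- pairwise.wilcox.test(sim$len, sim$treat, p.adjust.method = "fdr")

kw$p.value
pw$p.value   # lower-triangular matrix of adjusted p-values
```

The returned matrix has one row and one column fewer than the number of groups, with `NA` in the unused upper cells.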

Example E: Kruskal-Wallis test
The effect of different sugar treatments on pea length
## Summarize findings with a bargraph
> # column names are case-sensitive and must match the CSV header
> means <- with(purves, tapply(length, treat, mean, na.rm = TRUE))
> sds <- with(purves, tapply(length, treat, sd, na.rm = TRUE))
> n <- with(purves, tapply(length, treat, length))
> ses <- sds/sqrt(n)
> ys <- pretty(c(means - ses, means + (2 * ses)))
> xs <- barplot(means, beside=TRUE, axes=FALSE, ann=FALSE, ylim = c(min(ys), max(ys)), xpd=FALSE)
> arrows(xs, means+ses, xs, means-ses, angle=90, length=0.05, code=3)
> axis(2, las = 1)
> mtext(2, text = "Mean pea length", line = 3, cex = 1.5)
> mtext(1, text = "Sugar treatment", line = 3, cex = 1.5)
> # letters mark groups that do not differ significantly from each other
> text(xs, means + ses, labels = c('a','b','b','b','c'), pos = 3)
> box(bty="l")

Example F: Welch's test
The type of bird colony on beetle density
This example examines the effects of sea birds on tenebrionid beetles on islands in the Gulf of California. The hypothesis was that sea birds leaving guano and carrion would increase beetle productivity. The investigators had a sample of 25 islands and recorded the beetle density, the type of bird colony (roosting, breeding, no birds), the % cover of guano and the % plant cover of annuals and perennials.

Example F: Welch's test
The type of bird colony on beetle density
## 1 - Import data and check normality and equal variance
> sanchez <- read.table('sanchez.csv', header=TRUE, sep=',')
> boxplot(guano~coltype, data=sanchez)
> boxplot(sqrt(guano)~coltype, data=sanchez)

Example F: Welch's test
The type of bird colony on beetle density
## 1 - Import data and check normality and equal variance
Conclusions: clear evidence of non-normality and non-homogeneity of variance in the untransformed data. The square-root transform improved this a little, but the variances are still unequal.

Example F: Welch's test
The type of bird colony on beetle density
## Perform the Welch's test
> oneway.test(sqrt(guano)~coltype, data=sanchez)
Conclusion: significant difference; reject the null hypothesis.

Example F: Welch's test
The type of bird colony on beetle density
## Perform post-hoc test
> # unpooled variances, Holm-adjusted p-values
> pairwise.t.test(sqrt(sanchez$guano), sanchez$coltype, pool.sd=FALSE, p.adjust.method="holm")

Example F: Welch's test
The type of bird colony on beetle density
## Perform post-hoc test
> # unpooled variances, no p-value adjustment (for comparison)
> pairwise.t.test(sqrt(sanchez$guano), sanchez$coltype, pool.sd=FALSE, p.adjust.method="none")

Single Factor Classification Methods
ANOVA: all three assumptions satisfied
Welch's test: normality but NOT equal variance
Kruskal-Wallis test (non-parametric, tests medians): non-normality
Randomization tests: random sampling can NOT be assumed, but equal variance holds