Aedes egg laying behavior Erika Mudrak, CSCU November 7, 2018


Introduction

The current study investigates whether the mosquito species Aedes albopictus preferentially lays its eggs in water in containers that already have mosquito eggs, and if so, whether the number or species of eggs matters. The study took place at three houses, where 6 mosquito traps were deployed each week. Each of the 6 traps received one of five treatments:

- Control: just fish food (2 traps)
- 20 Ae. aegypti larvae + fish food
- 20 Ae. albopictus larvae + fish food
- 100 Ae. aegypti larvae + fish food
- 100 Ae. albopictus larvae + fish food

At the beginning of the week, blood-fed, gravid mosquitos that were marked with day-glo dust were released into the house. Researchers returned on each of the following 4 days to record the number of marked mosquitos in each of the 6 traps.

Hypothesis: Ae. albopictus prefers to lay eggs in containers with Ae. aegypti larvae because it is proof of high-quality habitat, and this increases competition between the two species.

Data

The data consist of 312 observations of 10 variables. Each row of data represents a trap on a given day. For each row, we are provided with the release number (week), the date of collection, the date of release, the time of release, the House, the trapid, the number of larvae in the trap, the species of larvae in the trap, the age of the trap larvae, and how many marked mosquitos were found.

Research Question

The research question is whether the number of marked mosquitos recaptured varies by trap type. If so, is mosquito species more important, or is the number of eggs more important?

Descriptive Statistics and Data manipulation

[Figure: Marked A. albopictus recaptured by trap larval species and density. x-axis: Number of larvae in traps (0, 20, 100); y-axis: Marked albopictus recaptured; legend: Species.of.trap.larvae (aegypti, albopictus, none).]

The predictors of interest are Number.of.trap.larvae and Species.of.trap.larvae.

       aegypti albopictus none
  0          0          0  104
  20        52         52    0
  100       52         52    0

Note that these variables are not fully crossed. There are in fact 5 unique treatments, but they are not well explained by Number.of.trap.larvae and Species.of.trap.larvae. There is no way to describe these treatments in a fully crossed way using two variables. I added a new Treatment variable, which defines the 5 unique treatments in one variable.

  0_none 20_aegypti 20_albopictus 100_aegypti 100_albopictus
     104         52            52          52             52

Note that the 0_none treatment has twice as many observations because two traps at each house had that treatment.

How does this treatment variable distribute among houses and release numbers?

Release number by release date:

    6/11/2018 6/15/2018 7/15/2018 7/2/2018 7/9/2018
  2        72         0         0        0        0
  3         0        24         0        0        0
  4         0         0         0       72        0
  5         0         0         0        0       72
  6         0         0        72        0        0

House by Treatment:

              0_none 20_aegypti 20_albopictus 100_aegypti 100_albopictus
  Lucas Bueno     32         16            16          16             16
  Lucas Malo      32         16            16          16             16
  Mia             40         20            20          20             20

Release number by Treatment:

    0_none 20_aegypti 20_albopictus 100_aegypti 100_albopictus
  2     24         12            12          12             12
  3      8          4             4           4              4
  4     24         12            12          12             12
  5     24         12            12          12             12
  6     24         12            12          12             12

House by release number:

               2  3  4  5  6
  Lucas Bueno 24  0 24 24 24
  Lucas Malo  24  0 24 24 24
  Mia         24 24 24 24 24

Release number is synonymous with release date. We will use Release number because it is shorter and sorts correctly as entered. House Mia had an extra release date (3), which neither of the other houses had. This causes some imbalance in sample size, but there is sufficient sampling for all houses.

Consider the response variable:

[Figure: Histogram of mark$marked.albo.fem. x-axis: mark$marked.albo.fem; y-axis: Frequency.]

    0   1   2   3   4
  270  26   7   6   3
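The combined Treatment variable described above can be sketched in base R as follows. This is only an illustration: the data frame is rebuilt here from the design counts in the tables (6 traps, 52 rows per trap), the column names follow the report, and the original data file and code are not available, so the object names and the construction are assumptions.

```r
# Hypothetical reconstruction of the trap design; column names follow the
# report, but this data frame is built here for illustration only.
traps <- data.frame(
  Number.of.trap.larvae  = c(0, 0, 20, 20, 100, 100),
  Species.of.trap.larvae = c("none", "none", "aegypti", "albopictus",
                             "aegypti", "albopictus")
)

# 13 house-by-release combinations x 4 collection days = 52 rows per trap.
mark <- traps[rep(seq_len(nrow(traps)), times = 52), ]

# The two predictors are not fully crossed: four of the nine cells are empty.
crossed <- table(mark$Number.of.trap.larvae, mark$Species.of.trap.larvae)
print(crossed)

# Combine both predictors into one factor holding the five realized treatments,
# with the control as the first (reference) level.
treatment_levels <- c("0_none", "20_aegypti", "20_albopictus",
                      "100_aegypti", "100_albopictus")
mark$Treatment <- factor(
  paste(mark$Number.of.trap.larvae, mark$Species.of.trap.larvae, sep = "_"),
  levels = treatment_levels
)
treatment_counts <- table(mark$Treatment)
print(treatment_counts)
```

Putting 0_none first in the level order makes the control the reference level, which matches the way the treatment coefficients are reported relative to the control in the model output.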

The response variable is highly zero-inflated. This could cause modeling issues. As per discussion on XXX date, it is not of interest to keep track of the counts on specific days following a release. We will aggregate the data by summing the counts over all four days following a release. This results in 78 observations, where each observation is the total number of marked mosquitos collected for a release date at a given trap at a given house. By summing the counts, we reduce the number of zeros in the data.

[Figure: Histogram of markagg$marked.albo.fem. x-axis: markagg$marked.albo.fem; y-axis: Frequency.]

   0  1  2  3  4  5
  43 16  8  7  3  1

With the aggregated data, we have a single observation for a given trap, house and Release.number:

, , House = Lucas Bueno

                Release.number
Treatment        2 3 4 5 6
  0_none         2 0 2 2 2
  20_aegypti     1 0 1 1 1
  20_albopictus  1 0 1 1 1
  100_aegypti    1 0 1 1 1
  100_albopictus 1 0 1 1 1

, , House = Lucas Malo

                Release.number
Treatment        2 3 4 5 6
  0_none         2 0 2 2 2
  20_aegypti     1 0 1 1 1
  20_albopictus  1 0 1 1 1
  100_aegypti    1 0 1 1 1
  100_albopictus 1 0 1 1 1

, , House = Mia

                Release.number
Treatment        2 3 4 5 6
  0_none         2 2 2 2 2
  20_aegypti     1 1 1 1 1
  20_albopictus  1 1 1 1 1
  100_aegypti    1 1 1 1 1
  100_albopictus 1 1 1 1 1

The plot below shows how the number of recaptured mosquitos varies by all important factors: Treatment, House, and release date.

[Figure: Marked albopictus recaptured by trap larval species and density. x-axis: Treatment; y-axis: Marked albopictus recaptured; panels by House (Lucas Bueno, Lucas Malo, Mia) and Release.number (2 to 6).]

Model

The main predictor of interest is Treatment. To address non-independence of the observations, we should account for House, Release.number and trapid. Since House only has three levels, we can add it in as a block (fixed) effect. Release.number has 5 levels and trapid has 6 levels within each house, so we can include them as crossed random effects. We also need to nest trapid within house because the traps were numbered within a house, and trap 1 at Mia doesn't relate to trap 1 at Lucas Bueno. We also need to include house nested within Release.number to group the 6 traps that were measured together for a given release date at a given house. Since the response is a count, we will use a generalized linear mixed model with a Poisson distribution.

Possible language for a manuscript: I fit a generalized linear mixed model with a Poisson distribution and a log link, using the R Statistical Software (R Core Team 2018) and the lme4 package (Bates et al. 2015). Fixed effects were Treatment and House, and random effects were Release.number, house nested within Release.number, and trap nested within house. Post-hoc analyses were conducted with the package emmeans (Lenth 2018).
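The aggregation step and the grouping structure behind the random effects can be sketched in base R on a skeleton of the design. This is a hedged illustration, not the original analysis code: the Day column, the object names, and the placeholder counts drawn with rpois() are all assumptions made here, since the raw data are not part of this report. Only the layout (3 houses, 6 traps per house, 4 collection days, release 3 at Mia only) comes from the tables above.

```r
set.seed(1)  # only so the simulated placeholder counts are reproducible

# Skeleton of the daily data: every day x trap x release x house combination,
# then drop release 3 for the two houses that did not have it.
mark <- expand.grid(Day = 1:4, trapID = 1:6, Release.number = 2:6,
                    House = c("Lucas Bueno", "Lucas Malo", "Mia"),
                    stringsAsFactors = FALSE)
mark <- subset(mark, !(Release.number == 3 & House != "Mia"))

# Placeholder response (simulated, NOT the study data).
mark$Marked.albo.fem <- rpois(nrow(mark), lambda = 0.2)

# Sum the counts over the four days following each release.
markagg <- aggregate(Marked.albo.fem ~ House + Release.number + trapID,
                     data = mark, FUN = sum)

# Number of levels of each grouping factor used as a random intercept.
n_groups <- c(
  "House:trapID" =
    nlevels(interaction(markagg$House, markagg$trapID, drop = TRUE)),
  "Release.number:House" =
    nlevels(interaction(markagg$Release.number, markagg$House, drop = TRUE)),
  "Release.number" = length(unique(markagg$Release.number))
)

print(c(daily_rows = nrow(mark), aggregated_rows = nrow(markagg)))
print(n_groups)
```

The skeleton gives 312 daily rows and 78 aggregated rows, with 18 trap-within-house groups, 13 house-within-release groups, and 5 releases, which is the structure the three random-intercept terms are meant to capture.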

R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.r-project.org/.

Douglas Bates, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/

Russell Lenth (2018). emmeans: Estimated Marginal Means, aka Least-Squares Means. R package version 1.2.3. https://cran.r-project.org/package=emmeans

We need to check this model for overdispersion. We used the function provided here: http://bbolker.github.io/mixedmodels-misc/glmmfaq.html

There did not appear to be strong overdispersion, so we can continue to interpret the model.

      chisq       ratio         rdf           p
86.87046098  1.27750678 68.00000000  0.06119733

Interpretation

Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) [glmerMod]
 Family: poisson  ( log )
Formula: Marked.albo.fem ~ House + Treatment + (1 | Release.number) + (1 | Release.number:House) + (1 | House:trapID)
   Data: markagg

     AIC      BIC   logLik deviance df.resid
   200.8    224.4    -90.4    180.8       68

Scaled residuals:
    Min      1Q  Median      3Q     Max
-1.5732 -0.7091 -0.4009  0.7947  2.7689

Random effects:
 Groups               Name        Variance Std.Dev.
 House:trapID         (Intercept) 0        0
 Release.number:House (Intercept) 0        0
 Release.number       (Intercept) 0        0
Number of obs: 78, groups:  House:trapID, 18; Release.number:House, 13; Release.number, 5

Fixed effects:
                        Estimate Std. Error z value Pr(>|z|)
(Intercept)              -1.8281     0.4680  -3.906 9.38e-05 ***
HouseLucas Malo           0.7885     0.3114   2.532 0.011342 *
HouseMia                  0.1598     0.3348   0.477 0.633091
Treatment20_aegypti       1.7918     0.4714   3.801 0.000144 ***
Treatment20_albopictus    1.9459     0.4629   4.204 2.63e-05 ***
Treatment100_aegypti      0.9808     0.5401   1.816 0.069348 .
Treatment100_albopictus   1.7346     0.4749   3.653 0.000259 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
            (Intr) HsLcsM HouseM Trtmnt20_g Trtmnt20_l Trtmnt100_g
HouseLucsMl -0.457
HouseMia    -0.425  0.639
Trtmnt20_gy -0.755  0.000  0.000
Trtmnt20_lb -0.769  0.000  0.000  0.764
Trtmnt100_g -0.659  0.000  0.000  0.655      0.667
Trtmnt100_l -0.750  0.000  0.000  0.745      0.758      0.650

We test the significance of the predictors using likelihood ratio tests:

Single term deletions

Model:
Marked.albo.fem ~ House + Treatment + (1 | Release.number) + (1 | Release.number:House) + (1 | House:trapID)
          Df    AIC     LRT   Pr(Chi)
<none>       200.78
House      2 203.63  6.8466    0.0326 *
Treatment  4 216.43 23.6454 9.407e-05 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

There was a significant effect of Treatment on the number of mosquitos recaptured. There is also a significant effect of House, but since this is a variable that we are only controlling for, we will not interpret it any further. We continue with post-hoc analyses of which Treatments are significantly different from each other.

 Treatment           rate         SE  df  asymp.LCL asymp.UCL
 0_none         0.2204632 0.09041833 Inf 0.09868102 0.4925365
 20_aegypti     1.3227784 0.31607141 Inf 0.82812649 2.1128932
 20_albopictus  1.5432415 0.34216223 Inf 0.99932878 2.3831940
 100_aegypti    0.5879015 0.20913036 Inf 0.29276034 1.1805841
 100_albopictus 1.2492907 0.30693610 Inf 0.77185065 2.0220587

Results are averaged over the levels of: House
Confidence level used: 0.95
Intervals are back-transformed from the log scale

The above table shows the expected (average) number of mosquitos collected for each treatment type, averaged over House, and accounting for slight release date and trap differences. These values are shown in the graph below. The error bars shown are the asymptotic lower and upper confidence limits. They are not symmetric around the estimate because they are back-transformed from the log scale used in the model.
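Several of the numbers reported above can be re-derived from the printed statistics alone, which is a useful sanity check on the output. The sketch below uses only base R and values copied from the tables. The variable names are illustrative, and the estimated marginal mean is computed under the assumption that it is the linear predictor averaged with equal weight over the three houses and then exponentiated.

```r
# Overdispersion check: Pearson chi-square divided by residual df, with an
# upper-tail chi-square p-value.
chisq <- 86.87046098
rdf   <- 68
overdisp_ratio <- chisq / rdf
overdisp_p     <- pchisq(chisq, df = rdf, lower.tail = FALSE)

# Likelihood ratio tests from the single-term deletions table.
lrt_p <- c(House     = pchisq(6.8466,  df = 2, lower.tail = FALSE),
           Treatment = pchisq(23.6454, df = 4, lower.tail = FALSE))

# Estimated rate for 20_aegypti: average the log-scale prediction over the
# three houses (Lucas Bueno is the reference level), then back-transform.
intercept  <- -1.8281
house_eff  <- c("Lucas Bueno" = 0, "Lucas Malo" = 0.7885, "Mia" = 0.1598)
trt_eff    <- 1.7918
eta        <- intercept + trt_eff + mean(house_eff)
rate       <- exp(eta)

# Delta method: SE on the log scale is SE(rate) / rate. The interval is
# symmetric on the log scale and becomes asymmetric after exp().
se_log <- 0.31607141 / 1.3227784
ci     <- exp(eta + c(-1, 1) * qnorm(0.975) * se_log)

print(c(ratio = overdisp_ratio, p = overdisp_p))
print(lrt_p)
print(c(rate = rate, lower = ci[1], upper = ci[2]))
```

Because the coefficients are rounded to four decimals in the printed summary, the recomputed rate and limits agree with the emmeans table only to about three decimals.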

[Figure: Estimated rate with asymptotic 95% confidence limits for each Treatment. x-axis: Treatment; y-axis: rate.]

The table below shows the pairwise comparisons among these five groups. P-values have been adjusted with a Tukey correction for a family of 5 estimates.

 contrast                           ratio         SE  df z.ratio p.value
 0_none / 20_aegypti            0.1666667 0.07856743 Inf  -3.801  0.0014
 0_none / 20_albopictus         0.1428572 0.06613002 Inf  -4.204  0.0003
 0_none / 100_aegypti           0.3750002 0.20252315 Inf  -1.816  0.3641
 0_none / 100_albopictus        0.1764707 0.08379850 Inf  -3.653  0.0024
 20_aegypti / 20_albopictus     0.8571429 0.27532117 Inf  -0.480  0.9892
 20_aegypti / 100_aegypti       2.2500000 0.95606610 Inf   1.908  0.3127
 20_aegypti / 100_albopictus    1.0588235 0.35809387 Inf   0.169  0.9998
 20_albopictus / 100_aegypti    2.6250000 1.09062042 Inf   2.323  0.1376
 20_albopictus / 100_albopictus 1.2352941 0.40302135 Inf   0.648  0.9671
 100_aegypti / 100_albopictus   0.4705882 0.20176302 Inf  -1.758  0.3983

Results are averaged over the levels of: House
P value adjustment: tukey method for comparing a family of 5 estimates
Tests are performed on the log scale

These pairwise comparisons are summarized as a compact letter display in the table below. Groups that do not share a letter are significantly different at the alpha = 0.05 level. Here we see that although traps with 100 aegypti larvae did not capture significantly more mosquitos than the controls, all three of the other treatments captured significantly more mosquitos than the control.

 Treatment           rate         SE  df  asymp.LCL asymp.UCL .group
 0_none         0.2204632 0.09041833 Inf 0.09868102 0.4925365  A
 100_aegypti    0.5879015 0.20913036 Inf 0.29276034 1.1805841  AB
 100_albopictus 1.2492907 0.30693610 Inf 0.77185065 2.0220587   B
 20_aegypti     1.3227784 0.31607141 Inf 0.82812649 2.1128932   B
 20_albopictus  1.5432415 0.34216223 Inf 0.99932878 2.3831940   B

Results are averaged over the levels of: House
Confidence level used: 0.95
Intervals are back-transformed from the log scale
P value adjustment: tukey method for comparing a family of 5 estimates
Tests are performed on the log scale
significance level used: alpha = 0.05

We could also make specific contrasts, aimed at determining if there are differences by species or egg number.

 contrast                 ratio        SE  df z.ratio p.value
 aegypti.v.none       3.9999983 1.8408920 Inf   3.012  0.0026
 albopictus.v.none    6.2981453 2.7688726 Inf   4.186  <.0001
 aegypti.v.albopictus 0.6351073 0.1701205 Inf  -1.695  0.0901
 twenty.v.0           6.4807380 2.8431184 Inf   4.260  <.0001
 hundred.v.0          3.8872996 1.7924725 Inf   2.944  0.0032
 twenty.v.100         1.6671568 0.4465663 Inf   1.908  0.0564

Results are averaged over the levels of: House
Tests are performed on the log scale

Conclusion

There was a significant effect of the 5-level treatment variable on the mean number of mosquitos recaptured (P < 0.0001). Traps with 100 A. aegypti larvae did not capture significantly more mosquitos than the control traps (P = 0.364), whereas all three of the other treatments captured significantly more mosquitos than the control (P < 0.005 for all). There were no significant differences in the number of mosquitos captured among these three treatments, however.

[Figure: Estimated rate for each Treatment with compact letter display. 0_none: A; 100_aegypti: AB; 20_aegypti, 20_albopictus and 100_albopictus: B. x-axis: Treatment; y-axis: rate.]
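As a check on the pairwise comparison table reported earlier, the rate ratios and the Tukey-adjusted p-values can be recomputed from the fixed-effect estimates and z ratios. This is a sketch under stated assumptions: it assumes the adjustment is the studentized-range calculation that base R exposes as ptukey(), applied to sqrt(2) times the absolute z ratio with infinite degrees of freedom. The object names and the helper function are illustrative, and only contrasts whose printed values are unambiguous are used.

```r
# Treatment coefficients on the log scale (control is the reference level).
coefs <- c("0_none" = 0, "20_aegypti" = 1.7918, "20_albopictus" = 1.9459,
           "100_aegypti" = 0.9808, "100_albopictus" = 1.7346)

# A rate ratio between two treatments is exp() of the coefficient difference;
# the House terms cancel because both rates are averaged over the same houses.
rate_ratio <- function(a, b) unname(exp(coefs[a] - coefs[b]))
ratios <- c("0_none / 100_aegypti"        = rate_ratio("0_none", "100_aegypti"),
            "20_aegypti / 20_albopictus"  = rate_ratio("20_aegypti", "20_albopictus"),
            "20_aegypti / 100_aegypti"    = rate_ratio("20_aegypti", "100_aegypti"),
            "20_albopictus / 100_aegypti" = rate_ratio("20_albopictus", "100_aegypti"))

# z ratios copied from the pairwise table (rounded to 3 decimals there).
z <- c("0_none / 100_aegypti"        = -1.816,
       "20_aegypti / 20_albopictus"  = -0.480,
       "20_aegypti / 100_aegypti"    =  1.908,
       "20_albopictus / 100_aegypti" =  2.323)

# Tukey adjustment for a family of 5 estimates; df = Inf because the tests
# are asymptotic (z tests).
p_tukey <- ptukey(sqrt(2) * abs(z), nmeans = 5, df = Inf, lower.tail = FALSE)

print(round(ratios, 4))
print(round(p_tukey, 4))
```

Small discrepancies from the printed table are expected because both the coefficients and the z ratios are rounded in the output.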