Evaluating DNA metabarcoding for beta diversity analysis: a study of Costa Rican dry forest arthropods and their associated bacteria


By Lisa Ledger
A Thesis presented to The University of Guelph
In partial fulfillment of requirements for the degree of Master of Science in Integrative Biology
Guelph, Ontario, Canada
Lisa Ledger, June, 2015

ABSTRACT: EVALUATING DNA METABARCODING FOR BETA DIVERSITY ANALYSIS: A STUDY OF COSTA RICAN DRY FOREST ARTHROPODS AND THEIR ASSOCIATED BACTERIA

Lisa Ledger
University of Guelph, 2015
Advisor: Professor M. Hajibabaei

This thesis is an evaluation of DNA metabarcoding as a method for measuring beta diversity in both terrestrial arthropods and their associated bacterial groups. Malaise-trapped terrestrial arthropods were collected from three plots of tropical dry forest with differing land use histories in the Area de Conservacion Guanacaste, Costa Rica. Following a bulk DNA extraction, multiple primer sets for cytochrome c oxidase I (COI) and the 16S ribosomal subunit (16S) were used to amplify barcoding markers. The DNA was sequenced on the Illumina MiSeq platform, and the sequences were both taxonomically classified and clustered into operational taxonomic units (OTUs). Significant beta diversity can be observed with both taxonomic data and OTUs, for both arthropods and bacteria, with OTUs outperforming taxonomic identification both in using the available sequence data and in capturing observed beta diversity. This work demonstrates the potential of DNA metabarcoding for biodiversity assessment of species-rich groups such as tropical arthropods.

Table of Contents

List of Tables
List of Figures
List of Abbreviations
Acknowledgements

Chapter 1: Introduction
  1.1 Biodiversity
  Definitions
  Importance
  Measuring Biodiversity
  1.3: DNA Barcoding
  1.3.1: Definition & Development
  1.4 Next Generation Sequencing
  1.5: DNA Metabarcoding
  1.6 Hypotheses & Predictions
  Research Goal
  Research Questions
  Hypotheses
  Predictions

Chapter 2: Methods
  Study System: Tropical Dry Forests
  Site Selection
  Target Organisms
  Collection method: Malaise trapping
  Sampling Protocol
  Barcoding Markers: COI and 16S
  DNA extraction
  Initial tissue homogenization
  Proteinase K digestion
  Tissue Lysis
  Spin filtration and DNA extraction
  PCR primer optimization and amplicon generation
  Sequencing
  2.5 Bioinformatics Pipeline
  Statistical analysis
  betadiver
  ADONIS
  betadisper

Chapter 3: Results
  DNA Extraction: Nucleic Acid Concentration and Purity
  Results of Sequencing and Bioinformatic Processing
  COI BE primer sets
  16S v3-v4 marker
  16S v6 marker
  Distribution and Characterization of COI Sequences
  Taxonomy results using MEGAN
  COI OTUs: Assignment of Orders
  Distribution and Characterization of 16S sequences
  Richness and Distribution
  Statistical Analysis with VEGAN
  β-diversity of COI sequences
  β-diversity of 16S sequences

Chapter 4: Discussion
  DNA sequencing results: quantity and quality
  Metabarcoding: OTUs and Taxonomic Identification
  Taxonomic Assignment
  Statistical Analysis
  Future Directions
  Ecological Analysis
  Additional Observations
  Bioinformatics

Chapter 5: Conclusions

Chapter 6: Tables & Figures
  Tables
  Figures

References

Appendix I: COI barcode identifications by site for order, family, genus and species
Appendix II: 16S barcode identifications by site for order, family and genus
Appendix III: COI Taxonomy Sequence Matrices
Appendix IV: 16S v3v4 Taxonomy Sequence Matrices
Appendix V: Target Regions of Primer Sets
Appendix VI: Replication & Quantification Across Experimental Stages
Appendix VII: Temperature and Precipitation, Oct. 18 - Nov. 1,

List of Tables:

TABLE 1: MALAISE TRAP IDENTIFICATION, WITH GEOGRAPHIC LOCATION DATA
TABLE 2: SPECTROPHOTOMETRY-DERIVED CONCENTRATION, ABSORBANCE VALUES AND ABSORBANCE RATIOS FOR MALAISE TRAP EXTRACTS MEASURED BY NANODROP
TABLE 3: SEQUENCE COUNTS FOR PAIRED AND QUALITY FILTERED COI SEQUENCES, BY TRAP AND PRIMER SET
TABLE 4: SEQUENCE COUNTS PRE- AND POST-QUALITY FILTERING FOR 16S SEQUENCES, BY TRAP AND PRIMER SET. NB: V3-V4 PRE-PRINSEQ IS PAIRED ALIGNED SEQUENCE, V6 IS UNPAIRED REVERSE PRIMER
TABLE 5: COUNT OF 95% SIMILARITY COI OTU CLUSTERS, BY ORDER, ACROSS SITES. ASTERISKS DENOTE ORDERS WITH SPECIES-LEVEL IDENTIFICATIONS AVAILABLE IN APPENDIX I
TABLE 6: SIGNIFICANCE (P-VALUES) AND PROPORTION OF ATTRIBUTABLE VARIATION (R² VALUES) FOR BETWEEN-SITE BETA DIVERSITY FOR COI AND 16S V3-V4 METABARCODED SEQUENCES, FOR ORDER, FAMILY, GENUS, SPECIES AND OTU PRESENCE-ABSENCE DATA

List of Figures:

FIGURE 1: MAP OF AREA DE CONSERVACION GUANACASTE WITH BIOME DISTRIBUTION. DR. WALDY MEDINA,
FIGURE 2: TOPOGRAPHIC MAP OF ACG FIELD SITES, WITH TRAP PLACEMENT: FIREBREAK (BLUE), SAN EMILIO (YELLOW) AND BOSQUE HUMEDO (GREEN) GOOGLE MAPS,
FIGURE 3: ASSEMBLED MALAISE TRAP 1B, FIREBREAK SITE
FIGURE 4: DISTRIBUTION OF GOOD LENGTH/QUALITY SEQUENCES FOR COI PRIMER SETS ACROSS MALAISE EXTRACTS
FIGURE 5: DISTRIBUTION OF GOOD LENGTH/GOOD QUALITY SEQUENCES ACROSS MALAISE EXTRACTS FOR TWO 16S MARKER REGIONS
FIGURE 6: PER-TRAP COUNT OF SEQUENCES ASSIGNED TO ARTHROPOD SPECIES AND OTUS VS TOTAL GLGQ SEQUENCE FOR COI METABARCODE DATA
FIGURE 7: PER-TRAP COUNT OF IDENTIFIED ARTHROPOD SPECIES VS OTU CLUSTERS FOR COI METABARCODE DATA
FIGURE 8: IDENTITY ASSIGNMENT FOR COI SEQUENCES AS 95% CLUSTERED COI OTUS
FIGURE 9: PER-TRAP COUNT OF IDENTIFIED BACTERIAL GENERA VS OTU CLUSTERS FOR 16S V3-V4 METABARCODE DATA
FIGURE 10: PER-TRAP COUNT OF SEQUENCES ASSIGNED TO BACTERIAL GENERA, OTUS AND TOTAL GLGQ SEQUENCE FOR 16S V3-V4 METABARCODE DATA
FIGURE 11: PER-TRAP COUNT OF IDENTIFIED BACTERIAL GENERA VS OTU CLUSTERS FOR 16S V6 METABARCODE DATA
FIGURE 12: PER-TRAP COUNT OF SEQUENCES ASSIGNED TO BACTERIAL GENERA AND OTUS VS TOTAL GLGQ SEQUENCE FOR 16S V6 METABARCODE DATA
FIGURE 13: R² VALUE OF ADONIS TESTS OF BETWEEN-SITE BETA DIVERSITY, BY MARKER AND BY TAXONOMIC LEVEL, CONDITIONED ON SITE. NOTE: NO SPECIES LEVEL DATA AVAILABLE FOR 16S, COI BETA DIVERSITY NOT SIGNIFICANT AT ORDER LEVEL
FIGURE 14: HOMOGENEITY OF DISPERSION FOR METABARCODED COI SEQUENCES BY SITE, FOR ORDER (A), FAMILY (B), GENUS (C), SPECIES (D) AND OTUS (E)
FIGURE 15: PRINCIPAL COORDINATE ANALYSIS (PCOA) FOR METABARCODED COI SEQUENCES BY SITE, FOR ORDER (A), FAMILY (B), GENUS (C), SPECIES (D) AND OTUS (E)
FIGURE 16: HOMOGENEITY OF DISPERSION FOR 16S V3-V4 BARCODED SEQUENCES BY SITE, FOR ORDER (A), FAMILY (B), GENUS (C) AND OTUS (D)
FIGURE 17: PRINCIPAL COORDINATE ANALYSIS (PCOA) FOR 16S V3-V4 METABARCODED SEQUENCES BY SITE, FOR ORDER (A), FAMILY (B), GENUS (C) AND OTUS (D)

List of Abbreviations:

16S: 16S ribosomal subunit rDNA
ACG: Area de Conservacion Guanacaste
ADONIS: Analysis of dissimilarity. Statistical test
BEF: Biodiversity and ecosystem functions
BES: Biodiversity and ecosystem services
BH: Bosque Humedo, primary forest site
bp: Base pair (e.g. A-T, G-C)
BOLD: Barcode of Life Data System
COI: Mitochondrial cytochrome c oxidase, subunit I
dsDNA: Double stranded DNA
DNA: Deoxyribonucleic acid
dNTP: Deoxynucleotide
EtOH: Ethanol
FB: Firebreak, field edge secondary succession site
GenBank: GenBank DNA sequence database
GLGQ: Good length, good quality. A sequence that has passed PRINSEQ quality filtering.
Mb: Megabase, one million bases of DNA
MgCl2: Magnesium chloride
mtDNA: Mitochondrial DNA
ng: Nanogram, SI unit, 10⁻⁹ of 1 gram
NGS: Next-generation DNA sequencing
OTU: Operational taxonomic unit
PCR: Polymerase chain reaction
rDNA: Ribosomal DNA
SE: San Emilio, 80+ year secondary forest site
ssDNA: Single stranded DNA
Taq: Thermus aquaticus
TDF: Tropical dry forest
µl: Microlitre, SI unit, 10⁻⁶ of 1 litre
v(#): Bacterial 16S rDNA variable region number, e.g. v3, v4, v6

Acknowledgements:

I would like to thank Genome Canada and NSERC for providing the financial backing for this project.

Thank you, members of the Hajibabaei lab. Some faces have come and gone over the three years of my Master's research, but to Shadi Shokralla, Joel Gibson, Ian King, Liz Bent, Behnam Nikbakht, Rafal Dobosz, Stéphanie Boilard, Shannon Eagle, Jennifer (Spall) Bossey, Nicole Fahner, Mike Wright, Gina Capretta and Katie McGee: whether it's brainstorming solutions to bioinformatics problems, polishing presentations, or just side by side lab work with a background of Disney songs, your camaraderie and encouragement have remained a constant across my time.

To my advisor, Dr. Mehrdad Hajibabaei, thank you for taking a chance on me. Your support and guidance have been a constant, in circumstances as diverse as long distance emails after mild Costa Rican earthquakes or simply letting me monopolize your office hours for help clarifying my outlines. As a member of your lab, I have always felt encouraged to step forward and stretch myself to new goals and experiences I might otherwise have hesitated on.

I would like to thank my advisory committee, Dr. Brian Husband and Dr. Manish Raizada, for their dedication and responsiveness. This has been a longer process than originally anticipated, and I'm so grateful that you've stuck with me through it all. The range of backgrounds you represent has brought forth new angles of analysis and consideration from every meeting.

To Roger Blanco and the park staff of the Area de Conservacion Guanacaste, my thanks and my compliments on providing and preserving an unparalleled space for biodiversity research. The ACG and the other national parks of Costa Rica offer a glimpse of a world where sustainability and development go hand in hand. I would like to thank Memo Guillermo Pereira, our parataxonomist, for his very literal guidance around the study sites and his education in local knowledge surrounding the plants and wildlife of the ACG. I would especially like to thank Dr. Dan Janzen and Dr. Winnie Hallwachs for their warm welcome and guidance, and above all for their instrumental role in shaping the ACG as we know it.

I could not have accomplished this without the support of family and friends. Whether it's listening to me try time and again to condense my thesis topic into a concise description or letting me escape up to the cottage woodlot when things get too stressful, "standing on the shoulders of giants" is a turn of phrase normally reserved for discussing the role of past scientific research, but it's just as true when speaking of the role of supportive loved ones.

I also want to acknowledge the support of my co-workers and supervisors at Computing and Communications Services. Very few jobs allow you the flexibility to pursue full time graduate studies while working them, and I can't think of a single one that would combine it with the range of talented and engaging people I've had the privilege to work with.

Finally, I would like to thank Steven Blewett. You followed me into the dry forest as my boyfriend, and now I'm exiting the Master's process with you as my husband. I can't imagine taking this journey without your love, support, and occasional reminders to eat.

Chapter 1: Introduction

1.1 Biodiversity:

Definitions:

In 1992, on the heels of a decade that witnessed growing concern for the state of global biological diversity and the environment (Soule & Wilcox, 1980; Wilson, 1988), the United Nations Convention on Biological Diversity set down the following definition of biodiversity: "Biological diversity means the variability among living organisms from all sources including, inter alia, terrestrial, marine and other aquatic ecosystems and the ecological complexes of which they are part; this includes diversity within species, between species and of ecosystems" (UN CBD, 1992).

Beyond the categories outlined by the Convention, biodiversity can be further considered in terms of species richness and species evenness (Gaston & Spicer, 2004). Species richness refers to the total number of species observed at a site, while species evenness is the similarity in relative abundance among species: a site will display high richness if a large number of species are present, but will have low evenness if the individuals of a few species vastly outnumber those of others (Gaston & Spicer, 2004). Richness is a relative concept, with general gradients from low to high richness arising from the interaction of factors such as latitude (richness decreases as latitude increases), elevation (richness decreases as altitude increases) and aridity (low in desert biomes, high in rainforests). 'High biodiversity' with respect to species richness should therefore be considered in the context of the region of study: a sample site located in a high elevation desert may display high richness relative to other high desert sites, but will have extremely low species richness compared to an equatorial rainforest (Gaston & Spicer, 2004).

Biodiversity also has a geographic context, with Whittaker's system of alpha, beta and gamma level diversity a commonly used concept. Alpha diversity is the diversity observed at a single site or within a single community, beta diversity is the diversity observed between sites, and gamma diversity is the diversity observed across all sites within a defined region (Whittaker, 1970). Although some confusion surrounding these terms has been acknowledged, and local (alpha), landscape (beta) and macro-scale (gamma) diversity have been proposed as more informative descriptors, alpha, beta and gamma diversity remain in common use (Whittaker et al, 2001).

Importance:

Biological diversity within an ecosystem has ecological and practical significance through its influence on ecosystem function and ecosystem services (Cardinale et al, 2012). Ecosystem functions are those aspects of biodiversity that affect the functioning of an ecosystem, such as resource capture, biomass production, decomposition and resource recycling (Pimental et al, 1997; Cardinale et al, 2012). Ecosystem services are those aspects of an ecosystem that are of benefit (service) to humanity. These can take the form of direct economic benefits, such as ecotourism or the harvest of wild species for food, or more indirect services, such as soil formation and nitrogen fixation, habitat for agricultural pollinator species and biological pest control, or genetic resources allowing for the intensification of agriculture. Industry may benefit from carbon sequestration, or from the ability of a functional ecosystem to provide bioremediation of contaminated soils or groundwater (Pimental et al, 1997; Cardinale et al, 2012). Ecosystem services can also take the form of mitigating the effects of climate disturbance on inhabited spaces, as seen in the 'sponge effect' of intact coastal mangroves or other wetland ecosystems on storm surge or seasonal flooding (Hoang et al, 1997; Mitsch & Gosselink, 2000; Crooks et al, 2011).

In a meta-analysis of over 1,700 comparative studies on the impacts of biodiversity loss on humanity, loss or alteration of native biodiversity levels negatively impacted both ecosystem functions and ecosystem services, but the scale at which biodiversity was measured differed between the two (Cardinale et al, 2012). Variations in ecosystem function were more often linked to variation in genetic diversity, whereas changes in ecosystem service were reported at higher levels, often as the presence or absence of entire functional groups (Hoang et al, 1997; Flynn et al, 2011; Cardinale et al, 2012).

The magnitude of biodiversity has also been linked to ecosystem stability. Stability, the ability of an ecosystem to withstand external disruptive pressures, is characterized either as resilience (the ability of an ecosystem to return to a prior state following perturbation, e.g. a grassland's regrowth following a wildfire) or as resistance (the ability to resist perturbation directly, e.g. a fire unable to sustain itself in mature tropical forest) (Peterson et al, 1998; Thompson et al, 2009). It has been hypothesized that a stable ecosystem can better mitigate the effects of localized anthropogenic disturbances or the more distributed effects of global climate change across a range of ecosystems, both terrestrial and aquatic (Peterson et al, 1998; Thompson et al, 2009; Elmqvist et al, 2003; Cardinale et al, 2012). The degree to which the magnitude of biodiversity influences stability has been a topic of considerable debate, with an initial model suggesting a clear correlation between high biodiversity and high stability (MacArthur, 1955). Later critiques of the MacArthur model challenged its real world applicability, given that the model assumes equilibrium and that equilibrium states where no selection is occurring are rare in nature (Goodman, 1975; Kimmerer, 1984).
A more recent review article suggests that the strength of the link between diversity and stability is influenced by environmental drivers affecting selection pressures and other non-equilibrium processes, and thus a link between high biodiversity and high stability should not be generally assumed (Ives & Carpenter, 2007).

Sustaining biodiversity must take place through conscious human intervention, either through protection of intact spaces or through restoration of anthropogenically impacted ones (Janzen, 1998). This can be both successful and cost-effective, and it has been argued that it must be economical if it is to succeed (Janzen, 1986; Janzen, 1998; Janzen, 2000; Peace & Moran, 2004; Naidoo & Adamowicz, 2005). Research into African bird diversity at forest sites has shown that it is more economical to preserve existing biodiverse ecosystems than to restore degraded ones. However, this finding comes from a single ecosystem type, and preservation alone is not viable for many of the earth's ecosystems, as they are simply too degraded or fragmented by previous human activity to be preserved without intervention (Janzen, 1988b; Janzen, 1999; Naidoo & Adamowicz, 2005; Willis, 2009). The relationship between high biodiversity and the provision of ecosystem services indicates that economic and political, as well as scientific, value can be found in maintaining native biodiversity levels in an ecosystem, and a mix of restoration and preservation may be the only real option for many ecosystems.

Effective conservation of a threatened space requires both knowledge of the extent and distribution of its original biodiversity and the ability to monitor the ecosystem for changes in biodiversity. Cost-effective methods that can rapidly assess a wide range of the beta diversity present within a landscape of mixed history or undergoing rapid change are thus of great value to conservation efforts.

Measuring Biodiversity:

There are multiple approaches to measuring biodiversity, which can be placed along two axes: scale of coverage and time cost.
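Whichever approach is taken, the quantities ultimately being estimated are those defined in Section 1.1. As a minimal illustration only, the Python sketch below computes species richness, one common evenness index (Pielou's J, chosen here as an assumption; the text above does not name a specific index) and one common formulation of Whittaker's beta diversity (regional gamma richness divided by mean alpha richness). The site names and counts are invented toy data, not results from this thesis.

```python
import math

def richness(counts):
    """Species richness: number of species with at least one individual."""
    return sum(1 for n in counts.values() if n > 0)

def pielou_evenness(counts):
    """Pielou's J: Shannon entropy divided by ln(richness); 1.0 means perfectly even."""
    present = [n for n in counts.values() if n > 0]
    if len(present) < 2:
        return 1.0  # convention chosen for this sketch: one species is trivially even
    total = sum(present)
    shannon = -sum((n / total) * math.log(n / total) for n in present)
    return shannon / math.log(len(present))

def whittaker_beta(sites):
    """Multiplicative beta diversity: gamma richness / mean alpha richness."""
    alphas = [richness(counts) for counts in sites.values()]
    gamma = len({sp for counts in sites.values() for sp, n in counts.items() if n > 0})
    return gamma / (sum(alphas) / len(alphas))

# Hypothetical toy data: two sites with equal richness but very different evenness.
sites = {
    "site_A": {"sp1": 10, "sp2": 10, "sp3": 10},
    "site_B": {"sp1": 28, "sp2": 1, "sp4": 1},
}

for name, counts in sites.items():
    print(name, richness(counts), round(pielou_evenness(counts), 3))
print("Whittaker beta:", round(whittaker_beta(sites), 3))
```

Both toy sites hold three species, so richness alone cannot separate them; the evenness index and the between-site comparison do.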

One approach to obtaining large-scale taxonomic knowledge of a particular geographic region is through a species inventory, where taxonomists collect and classify a large number of species, although this generally remains focused within a target group of organisms. A notable example can be seen in the recent publication of a study of arthropod diversity in a tropical forest (Basset et al, 2012). Over the course of 10 years, researchers from 31 institutions were able to collect and taxonomically classify specimens of 6,144 arthropod species located within a single 0.48 hectare plot (Basset et al, 2012). This paper can be considered a gold standard for a morphologically-based approach to cataloguing biodiversity, but the timeframe and manpower required to produce it make this depth of coverage impractical for many conservation efforts.

Species inventory need not always be limited to morphology-based methods, and many researchers now pair morphological identification with molecular approaches. An excellent example of a wide range of species coverage, a lengthy period of study, and the incorporation of modern molecular advances can be seen in the work of D.H. Janzen and his associates in their multi-decade endeavours to catalogue and interpret the biodiversity of Costa Rican forest systems. Beginning in the 1960s with uncovering previously unobserved mutualisms between ants and host tree species, examining the seasonal rhythms of dry forests, and a range of other classic studies of tropical ecology, Janzen's body of work in Costa Rica has produced a robust profile of biodiversity within a threatened tropical ecosystem (Selection: Janzen, 1966; Janzen, 1967; Janzen & Schoener, 1968; Janzen, 1970; Janzen, 1975; Janzen, 1988a).
With the advent of DNA barcoding, research from the Area de Conservacion Guanacaste in particular combines continued morphological investigation with the use of molecular approaches to resolve cryptic species or other taxonomic challenges (Selection: Janzen et al, 2005; Hebert et al, 2004b; Smith et al, 2007; Janzen et al, 2009; Brown et al, 2014).

In ecosystems where an abundance of prior research has been conducted, it is possible to measure biodiversity through the use of meta-analysis studies, where an overall view of diversity can be generated by assembling a collection of smaller biodiversity studies focused on a range of species, a common location, or both (Dunn, 2004; Rey Benayas et al, 2009). Meta-analyses are limited by the quality and coverage of the studies they pool, and thus their decreased time cost and increased statistical power are often constrained by the tendency for certain species groups or locations to be overrepresented while others are absent (Dunn, 2004).

Biomonitoring, the use of biological data to monitor environmental change in ecosystems, must necessarily minimize its time cost in order to produce data in a timely fashion. Biomonitoring has also traditionally relied on morphological identification, with a small pool of indicator species used to minimize the processing time required (Dufrêne & Legendre, 1997). In terrestrial environments, amphibians, birds, arthropods and soil organisms, among many other organisms, have all seen common use as bioindicator species (Temple & Wiens, 1989; McGeoch, 1998; Welsh & Ollivier, 1998; Ettema & Yeates, 2003). With traditional biomonitoring approaches, the indicator species are collected from a monitoring site, or series of sites depending on the scope of the project, taxonomically categorized, and enumerated (Dufrêne & Legendre, 1997; Baird & Hajibabaei, 2012). Morphological identification of biomonitoring samples is a highly labour-intensive process, where even small-scale local sampling translates into many months of identification and enumeration, and where the data produced is often of low taxonomic precision (Baird & Hajibabaei, 2012).
While morphological approaches to biodiversity assessment provide detailed answers regarding the diversity present in an ecosystem, the long processing time and high degree of training required for precise identification make them unsuitable for conservation efforts that require rapid response or have limited resources. DNA barcoding, an approach to identification that decreases the processing time required by using molecular markers instead of morphological keys, has been successfully used for biomonitoring using benthic invertebrates (Hajibabaei et al, 2011; Baird & Hajibabaei, 2012). Metabarcoding, the use of DNA barcoding on mixed environmental samples rather than selected target species, has also been applied to measuring benthic invertebrate diversity across multiple sample sites, and has been used to assess the diversity present within a single mixed sample of terrestrial arthropods (Hajibabaei et al, 2012; Gibson et al, 2014). DNA barcoding, and metabarcoding in particular, offer the opportunity to combine depth of coverage with greatly decreased processing time.

1.3: DNA Barcoding:

1.3.1: Definition & Development

DNA barcoding is an approach to taxonomic identification that has its roots in molecular phylogenetics: the use of heritable molecular information to establish phylogenetic relationships between organisms (Grauer & Li, 2001; Hajibabaei et al, 2007b). By using sequence information from a small genomic region (e.g. partial sequence of a mitochondrial gene) that is both near-universal and has a rate of divergence in step with the rate of speciation, organisms can be assigned to a species based on their genetic 'barcode' for the marker region: the pattern of nucleotide substitutions unique to that species (Hebert et al, 2003; Kress & Erickson, 2012). Using distance-based methods, DNA barcoding allows previously-characterized species to be rapidly identified and new species to be discovered and placed within existing taxonomy (Hebert et al, 2003). DNA barcoding also offers the benefit of resolving cryptic species, or life stages where diagnostic morphological characteristics are lacking (Hebert et al, 2003; Hebert et al, 2004b; Hajibabaei et al, 2007).

Since 2003, DNA barcoding protocols have been developed and agreed upon by consensus for animals, plants, protists and fungi, and while bacteria remain a challenge to identify to the species level, the 16S rDNA gene has seen a resurgence in popularity for use in taxonomic identification when paired with barcoding protocols (Hebert et al, 2004a; Tringe et al, 2008; Hollingsworth et al, 2009; Schoch et al, 2012; Kress & Erickson, 2012). Although taxonomic databases such as the Barcode of Life Data Systems (BOLD) will continue to increase in depth of coverage over time, research has expanded from assembling a reference DNA barcode library to include the use of DNA barcoding to explore a wide range of questions linked to species identification: consumer-oriented studies investigating the authenticity of supermarket fish or assessing the contents of natural health products combine with ecological surveys, biodiversity assessments, or the retrieval of DNA barcodes from museum specimens (Hajibabaei et al, 2006b; Hajibabaei et al, 2007b; Wong & Hanner, 2008; Francis et al, 2010; Wallace et al, 2012).

An early and persistent criticism challenges the claim that DNA barcoding could prove more cost-effective than traditional taxonomy, arguing that it will instead siphon away funding that could be applied to other projects (Will & Rubinoff, 2004; Rubinoff, 2006). More recently, there are concerns that the advent of next-generation sequencing technologies (NGS) and their potential for affordable multi-gene sequencing will render reliance on single gene data obsolete when compared to the depth of sequencing produced in a single NGS sequencing run (Taylor & Harris, 2012). While the question of which field is the most worthy of funding is not likely to be settled soon, NGS is hardly the demise of barcoding. Instead, it offers a major technological advance that reduces the time and cost of sample prep and the per-base cost of sequencing, and allows for more cost-effective expansion beyond single sequences into the realm of sequencing entire environments or communities (Hajibabaei, 2012; Joly et al, 2014).

1.4 Next Generation Sequencing

Next generation sequencing (NGS) or High Throughput sequencing (HTS) approaches are DNA sequencing technologies that have been developed in the last decade to increase throughput and reduce costs associated with genomics analysis. The sequencing platform used in my research, the Illumina MiSeq, follows the sequencing by synthesis approach. In sequencing by synthesis, the sequencer uses adapter oligonucleotides (oligos) annealed to PCR amplicons during a second round of PCR amplification to attach the amplicons to a flow cell, where fluorescently tagged nucleotides are washed through during DNA synthesis (Voelkerding et al, 2009). Clusters of both the forward and reverse ssDNA are created on the flow cell using bridging reactions following the initial attachment of the amplicons to the flow cell (Voelkerding et al, 2009). Once clusters are formed, dsDNA is synthesized from the tagged nucleotides washed through the flow cell on each sequencing cycle. Laser excitation following each base incorporation causes the tagged base to fluoresce, with the wavelength of light emitted identifying the base incorporated (Voelkerding et al, 2009). An image of this is captured and digitized, and the combined series of images taken during a sequencing run produces the raw sequence data from the run, which can total 25 million reads following initial filtering, with a current maximum read length of 2 x 300 bp (600 cycles of sequencing) and a raw error rate of 0.4% or less (Voelkerding et al, 2009; Quail et al, 2012; Illumina, 2015). It should be noted that the sequencing performed in this research uses an earlier version of the MiSeq chemistry, producing a 2 x 250 bp maximum read length (500 cycles of sequencing) (Illumina, 2015).

Compared to the Sanger chain termination method of sequencing, NGS offers several advantages. The ability to process multiple DNA samples in parallel rather than a single sample at a time provides a significant decrease both in the cost per megabase (Mb, or million bases) sequenced ($0.50/Mb for the MiSeq vs. $2400/Mb for a modern Sanger system) and in the time required for sequencing; a single run of the Illumina MiSeq produces millions of sequence reads with slightly over a day's runtime (Liu et al, 2012; Quail et al, 2012). Because it is designed to process sequences in parallel, the MiSeq can successfully sequence mixed samples, and in fact it performs better with a certain degree of sequence diversity present in the mix (Quail et al, 2012; Illumina, 2015). The massive output of data from NGS platforms such as the MiSeq presents bioinformatic challenges, but also provides stronger statistical power as a result of the increased dataset (Coissac et al, 2012).

Although its error rate is one of the lowest for current NGS platforms, there are known issues with homopolymer tracts of greater than twenty bases causing a local increase in missed base calls, and the read lengths are shorter than the 1000 bases generally considered standard for Sanger systems, although this gap is closing rapidly (Quail et al, 2012). Additionally, while the MiSeq is largely error-free in GC rich, AT moderate or neutral sequences, highly A-T rich genomes, such as that of Plasmodium falciparum, have been shown to have up to 30% of the genome missed by sequencing (Quail et al, 2012). Finally, when PCR products are used as templates for sequencing (e.g. in typical DNA barcoding protocols), the requirement for PCR amplification introduces primer biases, and thus complicates the question of whether abundance data can be generated from MiSeq data (Liu et al, 2012; Quail et al, 2012).
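The two sequence-level risk factors noted above, long homopolymer tracts and high A-T content, are straightforward to screen for. The Python sketch below is an illustration only and is not part of the pipeline used in this thesis: the 20-base run limit follows the text, while the A-T cutoff of 0.8 and the function names are hypothetical choices made for the example.

```python
from itertools import groupby

def longest_homopolymer(seq):
    """Length of the longest run of one repeated base (0 for an empty sequence)."""
    return max((len(list(run)) for _, run in groupby(seq.upper())), default=0)

def at_fraction(seq):
    """Fraction of bases that are A or T (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("A") + seq.count("T")) / len(seq)

def flag_sequence(seq, max_run=20, max_at=0.8):
    """Report whether a sequence exceeds either screening limit.

    max_run=20 follows the homopolymer length mentioned in the text;
    max_at=0.8 is a hypothetical cutoff chosen only for this example.
    """
    run = longest_homopolymer(seq)
    at = at_fraction(seq)
    return {
        "longest_run": run,
        "at_fraction": at,
        "long_homopolymer": run > max_run,
        "at_rich": at > max_at,
    }

# Hypothetical reads: one ordinary, one with a 25-base poly-A tract.
ordinary = "ACGTTGCAAGCTTGACCGTA"
poly_a = "ACGT" + "A" * 25 + "CG"

print(flag_sequence(ordinary))
print(flag_sequence(poly_a))
```

A screen of this kind only flags reads for closer inspection; it does not estimate how many base calls were actually missed.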

22 1.5: DNA Metabarcoding DNA metabarcoding uses NGS to gather sequence information from DNA present in bulk samples. Involving as it does the identification of multiple taxa from within a single sample of environmental DNA, DNA metabarcoding is closely related to the concepts of DNA barcoding. However, because it is often difficult to obtain species-level identifications from complex environmental samples (e.g. due to lack of a reference DNA barcode library for all organisms in an environmental sample) the umbrella term of 'metasystematics' (Hajibabaei, 2012) has been used. In other words, with metabarcoding, the use of barcode regions and databases to obtain species-level identification is both desirable and useful, but the biodiversity information contained within operational taxonomic units (OTUs) or in higher level taxonomic assignments is not discarded (Hajibabaei, 2012). DNA barcoding has great potential for rapidly assessing biodiversity, either as part of ongoing biomonitoring initiatives or as one-off assessments of vulnerable ecosystems or species groups (Shokralla et al, 2012; Taberlet et al, 2012; Cristecu, 2014). Arthropods, already a favoured source of biological indicator species, have been used to assess the success of metabarcoding at identifying specimens from mixed samples, both terrestrial and aquatic, and from both artificial mixes and field-collected bulk samples (McGeoch, 1998; Hajibabaei et al, 2011; Yu et al, 2012; Gibson et al, 2014). Results have so far been favourable: Using only one primer set, 74% of 23 species of benthic invertebrate were recovered from an assembly using NGS techniques, and 76% of arthropods were identified from a homogenized mixture of seven multi-species arthropod communities (Hajibabaei et al, 2011; Yu et al, 2012). 
When multiple primer sets were used, the percentage recovered using NGS jumped to 83.5% of Sanger-identified sequences recovered from a Malaise trap homogenate (plus additional species undetected by Sanger sequencing), and 87% of morphologically identified benthic invertebrates

23 in a sample where DNA was extracted from the ethanol preservative holding the specimens (Hajibabaei et al, 2012; Gibson et al, 2014). The improved recovery rates through the use of multiple primer sets support the recommendation that metabarcoding should not rely on a single universal primer if it hopes to capture true species diversity (Deagle et al, 2014). In each of these cases, metabarcoding has been performed using either a pre-assembled array of specimens, or in tandem with hand-sorting, morphologically identifying and Sanger sequencing in order to verify the success of the method. In the case of field samples, metabarcoding has not only recovered a high percentage of Sanger-identified sequences, it has also recovered sequences that were undetected by this method, as well as sequences for which no taxonomic record yet exists. With this groundwork in place, the question may now be asked: Can metabarcoding alone successfully identify the biodiversity present within a field-collected sample? Further, can metabarcoding be expanded from a single bulk environmental sample to multiple samples taken from multiple sites? If so, can it successfully identify patterns in beta diversity between these sites? 12

1.6 Hypotheses & Predictions:

Research Goal: To evaluate whether DNA metabarcoding is an appropriate and effective method for measuring beta diversity in both terrestrial arthropods and their associated bacterial groups.

Research Questions:
Does DNA metabarcoding generate sequence information of a quantity and quality suitable for presence-absence based statistical analysis of beta diversity for both terrestrial arthropods and their associated bacterial groups?
Can the sequences produced be categorized using both operational taxonomic units (OTUs) and taxonomic identification?
How do OTUs perform in comparison with identified taxa with regards to utilizing available sequence data?
How do OTUs perform in comparison with identified taxa with regards to measuring beta diversity?

Hypotheses:
Different land use histories will generate significant beta diversity between three plots of tropical dry forest within a 2 km² range.
The use of metabarcoding to obtain presence-absence data for terrestrial arthropod DNA barcode lineages will permit measurement of within-site and between-site beta diversity.
The use of metabarcoding to obtain presence-absence data regarding arthropod-associated bacterial diversity from 16S ribosomal markers will permit measurement of within-site and between-site beta diversity.

Predictions:
Due to the different land use histories of the sites, when assessed as presence/absence data using a dissimilarity matrix, the metabarcoded Malaise trap contents will reveal statistically significant beta diversity between the three plots of tropical dry forest.
Where species-level or genus-level identifications for metabarcoded terrestrial arthropods and associated bacteria are unavailable, respectively, the use of OTUs clustered at a given threshold (98% for arthropods, 97% for bacteria) will allow the OTUs to function as species- or genus-level proxies for purposes of beta diversity assessment.

25 Chapter 2: Methods 2.1 Study System: Tropical Dry Forests Representing an estimated 42% of original tropical and subtropical forest range (The exact figure is unclear due to historical and ongoing anthropogenic disturbance), tropical dry forest is present in both old world and neotropical regions (Murphy & Lugo, 1986). In the Americas, satellite-based estimates circa 2010 place TDF coverage at over 500,000 square kilometres of land, ranging from Mexico, across the Central American isthmus and into South America to Brazil (Portillo-Quintero & Sanchez-Azofeifa, 2010). Tropical dry forests are primarily deciduous forests that experience a high degree of seasonality in their ecosystem processes, shifting between a May-November wet season and a December-April dry season (Janzen, 1988a; Boinski & Fowler, 1989). This seasonality is responsible both for the lower overall richness of TDF compared to tropical rain forests (although this is highly species dependent -- 80% of mammal species may overlap between TDF and rain forest) and for the increased complexity in community interactions (Janzen, 1988a). During the dry season, upwards of 80% of canopy cover is lost, rainfall is minimal and fauna species adapt to low water conditions using a number of strategies: some migrate to other forest biomes, such as the cloud forests at higher altitudes (Janzen, 1986). Others retreat to damp refugia within fallen logs, along river valleys or on north-facing slopes where some cover and moisture is retained (Janzen 1986). The wet season sees a spike in biodiversity as there is an explosion of arthropod and vertebrate species, taking advantage of the rapid restoration of the forest canopy. Seasonal rivers and streams reflood, as daily rainfall periods deliver the bulk of the annual rainfall (between 1-3m) in the course of this period (Janzen, 1986; Boinski & Fowler, 1989). 14

26 The TDF retained existed largely in increasingly fragmented patches of questionable long-term ecological viability (Janzen, 1988a). In 1988, only 0.09% of the original range of TDF (480 km 2 ) was located within protected spaces (Janzen, 1988a). Accidental and intentionally-set fires encroached on the protected spaces that did exist, further eroding the remaining TDF, as the heat generated by a grassland fire is sufficient to clear all small above-ground woody structures and to damage any trees it does not directly consume (Janzen, 1988a; 1988b). Through a combination of socioeconomic initiatives, active intervention to prevent wildfires, managed cattle grazing, and the native resilience of the forest type in terms of wind and wildlife distributed seeds, the current outlook for TDF is much improved (Calvo-Avarado, 2009). Satellite imagery and forest cover maps of the Guanacaste region of Costa Rica compiled since 1960 show a rapid loss of forest peaking in 1979, followed by a recovery over the following decades to levels exceeding those of This cover lacks the density of the 1960 stands, as the bulk of it is secondary succession (Arroyo Mora et al, 2005a; b). The Area de Conservacion Guanacaste (ACG), where field samples were collected for this project, is one of the largest and earliest-established locations of protected TDF in Central America, and continues to serve as a test bed for developing techniques of TDF restoration and management as well as a study site for numerous aspects of biodiversity assessment (Janzen, 1986b; Allen, 2001; Janzen et al, 2005; Hajibabaei et al, 2006a). The current size of the terrestrial portion of the conservation area is 1,470 km 2, including tropical dry forest, rain forest, cloud forest and mangroves and allowing free movement of species across these biomes (Fig. 1) : Site Selection: After consultation with D. Janzen and R. 
Blanco, three sites within a 2 km² section of ACG Santa Rosa Sector were selected for sample collection (Fig. 2). All have been represented

27 in past species inventories of the flora and fauna of the ACG, including several papers on tropical forest regeneration (Burnham, 1997; Fedigan & Jack, 2001; Kalacska et al, 2004; Janzen et al, 2005). Bosque Humedo (BH) is a site of primary tropical dry forest that has been free of anthropogenic disturbance, outside of occasional small-scale selective timbering prior to the park s creation. Tree cores taken during previous studies document the presence of standing timber in excess of 460 years old (D. Janzen, pers. comm.). The canopy is closed, the understory open, and there are mature evergreen trees present, a rarity for neotropical dry forests that is considered indicative of the ancient state of neotropical dry forest (Burnham, 1997). The specimens used in previous work done by the Hajibabaei lab, of metabarcoding the contents of a single Malaise trap, were collected at Bosque Humedo (Gibson et al, 2014). Bosque San Emilio (SE), the site of the former San Emilio plantation, is an example of secondary tropical dry forest in late stage secondary succession. The trees are largely deciduous. Abandoned and allowed to regenerate naturally since the 1920s, there is a closed canopy and an open understory, typical features of late succession tropical dry forest (Kalacska et al, 2004). Soil cores taken for future study revealed the presence of small fragments of fired clay brick, but no visible structures remain. The site is sloped, with traps placed at the base of the hill near a game trail. The Firebreak site (FB) is a plot of 35 year-old secondary tropical dry forest in early secondary succession at the edge of a grassland. Managed by annual burning to produce a protective firebreak, the grassland consists of a monoculture of invasive jaragua (Hyparrhenia rufa) grass typical of unprotected spaces outside the park borders, as jaragua was initially imported for use in cattle pastures. The deciduous forest is likewise typical of early regeneration, 16

28 with a semi-closed canopy and a dense understory, and contained a large number of young thorn acacia trees with resident ant colonies (Kalacska et al, 2004). Traps were placed within the understory, 2 m back from the edge of a circular outgrowth of the field. Although the primary focus of this study is to evaluate the metabarcoding methodology, these three sites offer the opportunity to examine the potential effects of habitat restoration on beta diversity. The ACG s mix of secondary succession forest of varying ages with a few small stands of primary forest interspersed is a realistic example of land use within parks and preserves constructed from previously cleared land. The regrown dry forest stands of Santa Rosa, in particular, follow the boundaries of old farms and ranches, allowing for an array of successional stages within a single 2 km 2 zone of the sector. Because the sampling sites are all located within the same 2 km 2 region, this both increases the challenge to metabarcoding to successfully differentiate between sites and increases the likelihood that between-site differences will be due to differences in successional stage, as they are all located within the same ecosystem. Selecting primary forest, late stage secondary forest and early stage secondary forest bordering a monoculture grass pasture as sampling sites allows the comparison of a closed canopy and an open canopy secondary forest to the primary forest, in terms of their arthropod and associated prokaryote biodiversity. This sets the stage for considering questions of ecology, should the technical questions regarding NGS and beta diversity assessment be answered definitively. Likewise, the endangered nature of the tropical dry forest provides added value for biodiversity data obtained from it : Target Organisms: The organisms targeted by this study were neotropical terrestrial arthropods and their associated bacterial groups. 
Arthropods are a diverse group of organisms that are ubiquitous

29 across terrestrial biomes. They have been a rich source of bio-indicator species, and their small size allows for ease of collection, storage and transport of large numbers of individuals (McGeoch, 1998). Also as a source of environmental DNA, terrestrial arthropods are vectors for the directly-associated DNA of gut contents, both in terms of diet and their microbiome, as well as bacteria that are present in the environment but not directly associated with insect hosts (Gibson et al, 2014). A large number of studies involving the terrestrial arthropods of the ACG are currently documented in scientific literature (Selection: Janzen et al, 2005; Hajibabaei et al, 2006a; Smith et al, 2006; Smith et al, 2007; Bertrand et al, 2014; Gibson et al, 2014; Shokralla et al, 2014) Collection method: Malaise trapping As terrestrial arthropods were to be the primary organisms targeted, I selected Townesstyle Malaise insect traps as the collection method (Townes, 1972). Malaise traps address a number of challenges and requirements presented by both the study location and the sequencing method chosen: The traps are lightweight, portable, easy to assemble, resistant to normal weather conditions and require only the addition of ethanol to the trap heads to function (Townes, 1972; Noyes, 1989). Unlike pitfall traps or mist nets, they do not require frequent monitoring (Noyes, 1989), and given insect activity levels in the ACG during the time of the study, required only weekly servicing to swap out filled collection jars for empty ones with fresh ethanol. Based on consultation with local experts, time of year was the primary determiner of length for the collection run: given insect population numbers and activity at the end of the rainy season, a two week run was considered a suitable length of time to amass enough variety of insect biomass to present a representative sample of the species present. 
Detailed temperature and precipitation records from the collection period are recorded in Appendix VII.

30 The collected samples were shipped within the ethanol solution they were collected in, with no refrigeration required, as the specimens were kept intact until processing. The disturbance of the traps surroundings was minimal, and their design and colouration (Fig. 3) allowed them to intercept normal insect behaviour such as light-seeking or the use of flight paths for travel through the forest understory (Townes, 1972). Malaise traps will bias the specimens collected towards winged insects, although placement against the ground will allow some ground dwelling species access, and the 2 cm mesh used at the collecting head may prevent larger species of arthropod from entering the trap (Noyes, 1989) : Sampling Protocol A flow chart outlining the major steps of sampling, extraction, sequencing and bioinformatic processing is available in Appendix VI. 9 Townsend-style (BugDorm II) traps were purchased and transported unopened in their storage bags to the site to reduce the potential for contamination. GPS coordinate were recorded for each location, accurate to within +/- 2 m (Table 1). The collecting heads were filled with 95% ethanol, and new catch jars were swapped in 7 days into the 14 day collection run. (October 18 th November 1, 2012). With export and import permits obtained from the Costa Rican and Canadian governments, the 18 sealed jars were transported to the University of Guelph by commercial courier, arriving in early January of 2013, and were stored at -20 C until ready for DNA extraction Barcoding Markers: COI and 16S Following the recommendations of Gibson et al, I used multiple primer sets for both COI and 16S to maximize sequence recovery from the trap contents (Gibson et al, 2014). Three mini- 19

31 barcode primer sets, which have shown previous success in amplifying Malaise-derived DNA (BF/ER, F4/R5 and F10/R3), were used to target a 335 bp region of COI (Gibson et al, 2014; Shokralla et al, 2014). Two primer sets were used to generate amplicons for the 16S subunit rdna gene (16S), providing coverage of variable regions three through four (v3v4) and four through six respectively (v4v6), with sequence overlap in the v4 region. The use of the two 16S markers to target bacterial genera, when paired with COI primers targeting terrestrial arthropod DNA, allowed a single round of sampling to be used to measure the beta diversity observed in the two most diverse groups of eukaryotic and prokaryotic organisms (Gibson et al, 2014). 2.2 DNA extraction: I processed each trap s contents individually over the course of seven weeks, with all non-single use equipment decontaminated with Eliminase wash and a 30 minute UV sterilization between each step. The contents of the week 1 and week 2 catch jars were pooled for each trap and a 50 ml sample of ethanol drawn off, labelled and stored at -20 C as a voucher Initial tissue homogenization: I produced a crude homogenate by blending the insects and ethanol using a 12 speed consumer-model blender that had been previously decontaminated and sterilized. This homogenate was transferred to 50 ml Falcon tubes and centrifuged at 2000 rpm for 2 minutes to sediment the tissue. Excess ethanol was drawn off and the tubes re-centrifuged, the process repeated until the sediment formed a pellet. The pellets were incubated at 70 C, with shaking, to evaporate the remaining EtOH. Once dry, the homogenate pellets for each trap were combined into a single tube, mixed, and stored at -20 C. Each trap yielded sufficient dry mass to allow further homogenization and DNA extraction, and to retain excess dry mass as voucher material. 20

Using a decontaminated scoopula, I subsampled each trap's homogenate into 3 separate extraction tubes per trap, with ~1 ml of dried homogenate added to each tube. The remaining dry mass in the Falcon tubes was retained as vouchers and stored at -20 °C. DNA was extracted using a Nucleospin Tissue DNA extraction kit (Macherey-Nagel) and a minor modification of the kit protocol:

Proteinase K digestion: The crude homogenate was first rehydrated with 720 µl of the kit's T1 buffer and then further homogenized with an MP FastPrep tissue homogenizer for 40 s at 6 m/s. Following this second homogenization step, the tubes were spun down in a microcentrifuge and 100 µl of proteinase K was added to each to digest proteins in preparation for tissue lysis. After vortexing to ensure even distribution of the proteinase K, the tubes were incubated at 56 °C for 16 hours, with shaking, as the proteinase digest occurred.

Tissue Lysis: Once the digest was completed, work surfaces were once again decontaminated, and a 1 ml aliquot of molecular-grade ddH2O was placed in a heating block to warm to 95 °C. The exteriors of the tubes were wiped with Eliminase. The tubes of digest were centrifuged for 3 minutes at 11,000 g and 205 µl of supernatant was transferred to each of three clean microfuge tubes per tube of digest (3 tubes of digest yielding 9 tubes of supernatant, per site). The samples were lysed by adding 200 µl of solution B3 from the kit and incubating at 70 °C for 10 minutes. Following lysis, 210 µl of 95% EtOH was added and the tubes vortexed.

Spin filtration and DNA extraction: Collection tubes were assembled for each lysed solution, with a spin column placed in each. The supernatant was transferred and the columns centrifuged at 11,000 rpm for 1 minute.

The flow-through was discarded, and 500 µl of M-N BW buffer was washed through with a 1 minute centrifugation at 11,000 rpm. The flow-through was again discarded and the columns replaced in their collection tubes. A second wash with 600 µl of B5 solution, again at 11,000 g for 1 minute, was followed by transferring the filters, now with bound and washed DNA, to clean 1.5 ml tubes. DNA was eluted from the filters with 30 µl of warmed water and the 9 subsampled tubes pooled together. Purity and concentration of DNA for each site were determined using a NanoDrop spectrophotometer and recorded in Table 2. NanoDrop analysis confirmed that DNA of sufficient purity and concentration for PCR amplification was obtained for each trap. Samples were kept at 4 °C until PCR optimization and pre-sequencing amplification were completed, to minimize freeze-thaw cycling, and then stored at -20 °C.

2.3 PCR primer optimization and amplicon generation

For arthropod DNA, two primer sets (ArF4/ArR5, ArF10/ArR3) and a thermocycler program were selected that had been developed and tested with previous Malaise-trapped arthropods in the ACG, thus simplifying the optimization process (Gibson et al, 2014), along with a third primer set (BF/ER), amplifying the same region of COI, developed for environmental barcoding of benthic invertebrates (Hajibabaei et al, 2012). The three primer sets [B-F (5'-CCIGAYATRGCITTYCCICG-3'), E-R (5'-GTRATIGCICCIGCIARIAC-3'); ArF4 (5'-GCICCCGATATRGCITTYCCYCG-3'), ArR5 (5'-GTRATIGCICCIGCIARIACIGG-3'); and ArF10 (5'-CCWGATATAKCITWYCCICG-3'), ArR3 (5'-GTRATWGCICCIGCTARWACWGG-3')] targeted a 310 base segment in the middle of the standard COI barcode region. Prokaryote DNA was amplified by two primer sets targeting variable regions three and four [16S v3F (5'-ACTCCTACGGGAGGCAGCAG-3'), 16S v4R (5'-GGACTACARGGTATCTAAT-3')] and variable regions four through six [16S v4F (5'-TGCCAGCAGCCGCGGTAA-3'), 16S v6R (5'-ACGAGCTGACGACARCCATG-3')] of the 16S ribosomal subunit gene. Diagrams of the marker regions and their target areas are available in Appendix V.

In addition to full concentration DNA template, 1/10th and 1/100th dilutions were prepared for each trap's DNA extract. Each primer was tested at varying concentrations and amounts of template, and the results visually assessed via 1.5% agarose gel electrophoresis to determine amplification strength and quality, with quality defined as crispness and brightness of the visualized DNA bands, and an absence of non-specific amplification. The final components for each PCR mix (25 µl total volume) were as follows: 2 µl of 1/100th dilution DNA template, to a final concentration of ~1 ng/µl, 17.5 µl H2O, 2.5 µl PCR buffer, 1 µl MgCl2 (50 mM), 0.5 µl dNTPs (10 mM), 0.5 µl forward primer (10 mM), 0.5 µl reverse primer (10 mM), 0.5 µl Platinum Taq polymerase (5 units/µl) (Invitrogen).

For COI amplification, the thermocycling regime was as follows: initialization at 94 °C for 5 minutes, denaturation at 94 °C for 40 seconds, annealing at 46 °C for 1 minute, extension at 72 °C for 30 seconds. Following 30 cycles, a final extension at 72 °C for two minutes was performed, and reactions were then held at 10 °C. For 16S amplification: initialization at 94 °C for 2 minutes, denaturation at 94 °C for 1 minute, annealing at 46 °C for 30 seconds, extension at 72 °C for 1 minute. Following 30 cycles, a final extension at 72 °C for 5 minutes was performed, and reactions were then held at 10 °C. Negative controls with no DNA template were used during all PCR amplifications.

2.4 Sequencing

Amplicons for both COI and 16S were purified with a QIAgen MinElute column-based purification kit (QIAgen) and eluted in 50 µl molecular biology grade water before a second 10

cycle PCR amplification to attach Illumina adaptor tails prior to sequencing, as well as unique tags identifying primer set and trap ID, for future sequence recovery. Amplicons were then sequenced on the Illumina MiSeq platform, using the bp paired-end sequencing protocol developed for the MiSeq, and MiSeq 500 cycle reagent kits. Sequencing workflow followed manufacturer's protocols and was performed by the Hajibabaei lab genomics facility at the Biodiversity Institute of Ontario over two sequencing runs, with each marker region allotted half the coverage of its run. Given a total Illumina MiSeq output of 25 million available reads, five primer sets and nine samples, 1.38 million reads are available per marker region, per sample. These are further subdivided into 460,000 maximum reads per sample for each COI primer set and 690,000 maximum reads per sample for each 16S primer set.

2.5 Bioinformatics Pipeline

Two .fastq files were generated for each trap's sequences: one forward and one reverse. PANDAseq (Masella et al, 2012), a paired end assembly program specifically for Illumina sequencing outputs, was used to align the paired ends and combine the two files into a single file per site, per primer. The minimum overlap used was 20 bases, and the first 25 bases in the forward and reverse directions were trimmed to remove the primer sequences. PRINSEQ Lite (Schmieder & Edwards, 2011) was then used to perform quality filtering of the newly aligned sequences. The minimum allowed sequence length was 130 bases, as previous publications by Hajibabaei and Meusnier demonstrated that a minimum length of 130 bp was sufficient to obtain 91-4% species identification when using mini-barcode primer sets (Hajibabaei et al, 2006b; Meusnier et al, 2008). The minimum Phred score was 20, representing 99% accuracy in base calling, with a scanning window size of 10 and a window step size of 5.
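The read allocation and the Phred threshold quoted in sections 2.4 and 2.5 follow from simple arithmetic. The sketch below reproduces them in Python, assuming (as a reading of the text, not a stated rule) that the 25 million reads are split evenly between the two marker regions, then across the nine traps, then across the primer sets for each marker. The function names `read_budget` and `phred_to_accuracy` are illustrative only.

```python
# Illustrative arithmetic only; names and the even-split assumption are not from the thesis.

def read_budget(total_reads, primer_sets_per_marker, n_samples):
    """Split reads evenly by marker region, then by sample, then by primer set."""
    per_marker = total_reads / len(primer_sets_per_marker)
    budget = {}
    for marker, n_primer_sets in primer_sets_per_marker.items():
        per_sample = per_marker / n_samples
        budget[marker] = {
            "per_sample": per_sample,
            "per_primer_set": per_sample / n_primer_sets,
        }
    return budget


def phred_to_accuracy(q):
    """Convert a Phred score Q to base-call accuracy: 1 - 10^(-Q/10)."""
    return 1 - 10 ** (-q / 10)


# Three COI primer sets, two 16S primer sets, nine Malaise traps.
budget = read_budget(25_000_000, {"COI": 3, "16S": 2}, n_samples=9)
for marker, values in budget.items():
    print(marker, round(values["per_sample"]), round(values["per_primer_set"]))
print(f"Q20 accuracy: {phred_to_accuracy(20):.2%}")
```

Under these assumptions the per-sample figure is about 1.39 million reads per marker region, with roughly 463,000 reads per COI primer set and 694,000 per 16S primer set, in line with the rounded values given in the text.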

36 Chimera removal and clustering was done to 98% similarity for COI and to 97% for 16S sequences for each trap using the program USEARCH (Edgar, 2010). The COI sequences were searched against all publically available COI sequences in GenBank using megablast and the output exported into MEGAN 4 to generate a taxonomic hierarchy (McGinnis & Madden, 2004; Huson et al, 2007). For the 16S V3-V4 and V6 region sequences, the online classifier tool of RDPipeline, (Wang et al, 2007) a naïve Bayesian classifier, was used to assign all sequences a taxonomic identity and an associated confidence value for that assignment. Sequences that could not be assigned a domain with 100% confidence were discarded, as were sequences with less than 70% confidence in assignment of family. To assess the diversity of operational taxonomic units (OTUs) present, a master library was assembled by combining all COI fasta files from the 9 sites prior to clustering, as the same 335 base region was amplified by all three COI primers, and by creating separate master lists for the V3-V4 and V6 fragments of 16S. Once concatenated, the sequences were clustered to 98% similarity for COI and 97% similarity for the 16S fragments and were renamed to reflect the master library. The individual sequence files were then BLASTed against the master lists to create an assembly.csv file denoting the overall size of each cluster and its distribution across the trap contents. This process was repeated for 16S v3-v4 and the v6 fragment, again substituting RDPipeline s classifier for the BLAST alignment. While order-level identification for 16S sequences unidentified at the genus level could be obtained by revisiting the parameters for inclusion applied to the classifier output and identifying sequences where an order could be assigned with 90% confidence or better, COI proved more complicated. 
For the COI OTU sequences, the clusters produced through clustering at 98% similarity resulted in a computational bottleneck for the analysis, as the

37 program was unable to correctly partition taxonomic information about the clusters to their component sites without failing and generating blank files. To overcome this, the sequences were clustered at 95%, the highest percentage of similarity at which the program could run, and then BLASTed for an identity of 90% or better. The results were assembled in a.csv file delineating which clusters were present at each trap and the order they belonged to. 2.6 Statistical analysis After converting the results of taxonomic ID and OTU assembly into pivot tables using Microsoft Excel, sequence matrix tables were created and then exported in.csv format for all marker regions at the OTU, genus, family and order levels. To create files of a manageable size, OTUs or genera represented by < 100 total sequences across all sites were discarded. A second round of exclusion saw sequence counts of < 10 for COI and < 3 for 16S for individual traps replaced by a zero value, thus reducing the potential for Type I statistical errors. The difference in magnitude in COI GLGQ sequence count and 16S GLGQ sequence count is the reason for the differing minimum thresholds for the second round of exclusion. Statistical analysis of beta diversity for order, family, genus, species (COI only) and OTUs for COI and the two 16S marker regions was conducted in R, using the VEGAN package of biodiversity analysis tools (Okansen et al, 2007; R Core Team, 2013). Beta diversity scores for all traps and marker regions were calculated using the betadiver, which automatically converted the sequence count matrices into presence-absence data, and permutational multivariate ADONIS tests using the betadiver output were conducted with the adonis command, with permutations = 200, the program default. For each level of analysis, a 26

homogeneity of dispersal boxplot and a principal coordinate analysis plot were generated from a model built with the betadisper command (Oksanen et al, 2007).

betadiver: Given a table with rows denoting sample and columns denoting the contents of the sample (trap and trap contents, for this thesis), betadiver returns a dist object, a matrix of pairwise diversity index calculations which can then be subjected to diversity analysis by the functions adonis and betadisper (Oksanen et al, 2007). The nature and parameters of the dist object are determined by the diversity index selected (Oksanen et al, 2007). 24 separate beta diversity indices are offered for presence-absence data, drawn from the work by Koleff et al. to assess beta diversity indices found within the scientific literature against a common framework of components: a, the species shared between sites; b, the species absent from the focal site; and c, the species unique to the focal site (Koleff et al, 2003; Oksanen et al, 2007). For this study, I selected Cody's 1975 beta diversity index (simplified by Koleff et al. to (b + c)/2), developed for measuring the change in species composition along an environmental gradient. In the case of my research, the progression of sites from primary forest through late secondary to early secondary succession serves as a gradient. Cody's beta diversity index is symmetric, in that neighbouring and focal sites (in this case individual traps) can be exchanged without altering the measure of diversity (Koleff et al, 2003). The scale for Cody runs from 0 (no dissimilarity) to 50 (no similarity). Should either b or c equal zero for a given pair, i.e. if the focal site contains all species present, or if it contains no species unique to it, a linear relationship is observed between the maximum and minimum values for the index (Koleff et al, 2003). It is recommended as a measure of both continuity and of gain and

loss, and has the advantage of being a broad-sense measure of beta diversity, ignoring the relative magnitude of the species present, which is particularly advantageous given the known impact of PCR primer bias on calculating abundance from sequence data (Koleff et al, 2003). The main flaw of the Cody method, that it is not suited to data involving transects, does not apply to this thesis research as no transects are used (Koleff et al, 2003). Betadiver can generate distance matrices, but statistical analysis relies on the use of additional VEGAN functions: adonis and betadisper.

ADONIS: Adonis is a function that performs a permutational multivariate analysis of variance for metric and semi-metric distance matrices. Unlike ANOVA (or its multivariate counterpart, MANOVA), it tests significance based on sums of squares using permutations of the raw data, rather than of the residuals. This is better for datasets with a small number of samples, such as the nine Malaise traps of this research, that contain many columns of data (observed species, OTUs, etc.) (Oksanen et al, 2007). In addition to the significance test, if ADONIS is flagged to condition the samples based on a shared condition (Site, for my datasets) it will also generate an R² value, a correlation measure indicating how much of the dissimilarity present can be attributed to the condition (Oksanen et al, 2007). For my research, the R² value represents between-site beta diversity.

betadisper: Betadisper produces a model where the distances contained in the betadiver dist object are reduced to the principal coordinates between objects (traps) and their group (Site) centroids, and are also plotted relative to an overall centroid (Oksanen et al, 2007). Visualized in a principal coordinate analysis (PCoA) plot using the plot() command on the betadisper model, the

PCoA plot displays the within-site beta diversity as points around group centroids, and the between-site beta diversity as the relationship between the groups (Oksanen et al, 2007). In a landscape with low between-site beta diversity, the non-dimensional space occupied by the groups will overlap, with the distance between the groups increasing with their dissimilarity to each other. Within-site beta diversity can be visualized from the PCoA plot by observing the distance between points within a group, with dissimilarity increasing as the points grow more distant. Another method for visualizing within-site beta diversity is through the use of a homogeneity of dispersal (HoD) plot, obtained using the boxplot() command on the betadisper model (Oksanen et al, 2007). HoD plots a confidence interval (based on the Studentized Tukey 'Honest Significant Difference' method) for each group's (Site's) mean distance to the group centroid, and displays the groups side by side as a box-and-whisker plot (Oksanen et al, 2007). As with PCoA, the distance is Euclidean rather than a real unit of measure.
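As an illustration of the index selected above, the following is a minimal Python sketch of Cody's measure in the (b + c) / 2 form, computed for every pair of traps in a presence-absence dataset. It is not the VEGAN implementation: the trap labels and taxon names are hypothetical, and the function names `cody_index` and `pairwise_cody` are introduced here for illustration only.

```python
from itertools import combinations


def cody_index(focal, neighbour):
    """Cody's index in the Koleff et al. (2003) form: (b + c) / 2."""
    b = len(neighbour - focal)  # taxa found only at the neighbouring trap
    c = len(focal - neighbour)  # taxa found only at the focal trap
    return (b + c) / 2


def pairwise_cody(samples):
    """Return {(trap_i, trap_j): dissimilarity} for every pair of traps."""
    return {
        (i, j): cody_index(samples[i], samples[j])
        for i, j in combinations(sorted(samples), 2)
    }


# Hypothetical presence-absence data: trap label -> set of taxa detected.
traps = {
    "1A": {"sp1", "sp2", "sp3"},
    "2A": {"sp2", "sp3", "sp4", "sp5"},
    "3A": {"sp6"},
}

distances = pairwise_cody(traps)
for pair, value in distances.items():
    print(pair, value)
```

Because b and c simply trade places when the focal and neighbouring traps are swapped, the sketch returns the same value in either direction, which is the symmetry property described for the index.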

Chapter 3: Results

3.1: DNA Extraction: Nucleic Acid Concentration and Purity:

Nucleic acid concentrations for the sub-sites ranged between ng/ml (2C) to ng/ml (3A). Average concentrations across sites are ng/ml for Site 1 (Firebreak), ng/ml for Site 2 (San Emilio) and ng/ml for Site 3 (Bosque Humedo) (Table 2, Fig. 2). The average A260/280 ratio across samples was 2.08, and the average A260/230 ratio across samples was 2.27. The ratio between absorbance at A260 and at A280 indicates that the samples do not contain pure DNA, as the A260/280 ratio of a pure sample of DNA should be ~1.80. This may indicate the presence of a significant percentage of RNA, as the A260/280 ratio for pure RNA is ~2.00. Phenols and other contaminants absorbing at A230 are another potential source of contamination, leading to an under-estimate of the A260/280 ratio when present, but the calculation of the A260/230 ratio allows for their detection and isolation. A non-contaminated sample should yield an A260/230 ratio of ~ , with lower readings indicating the presence of contaminants. With an average A260/230 ratio of 2.27, the DNA extracts are not notably contaminated with substances that absorb at A230.

3.2 Results of Sequencing and Bioinformatic Processing:

3.2.1: COI BE primer sets

8,530,853 COI sequences were obtained after paired end alignment (Table 3). Following primer trimming and discarding sequences with a Phred score < 20, 8,069,380 sequences of good length and good quality (GLGQ) remained (Table 3). In preparation for BLAST alignment and species level identification using MEGAN 4, clustering all GLGQ sequences at 98% with no exclusion based on size produced 3,750,649 clusters when processed by individual site (Table 3).

When first assembled as a master list, the generation of OTUs at 98% similarity from GLGQ sequences produced 5,322 clusters of size ≥ 100, containing 7,354,940 sequences, or 91.15% of all GLGQ COI sequences. The large difference in cluster count between master list OTU clusters and individual site clustering, despite identical similarity thresholds, can be explained by site-based clustering treating identical or near-identical sequences at different sites as separate clusters, whereas the master list approach unifies them. Clustering at 95% similarity was used to identify operational taxonomic units to the order level (section 3.3.2), which resulted in 4,426 clusters of size ≥ 100 containing 7,440,217 sequences, or 92.20% of GLGQ COI sequences.

3.2.2: 16S v3-v4 marker

The initial 55,158 good length/good quality sequences for the 16S v3-v4 fragment contained 19,164 sequences that could be classified by RDPipeline to 100% confidence in assignment at the domain rank and no less than 70% confidence at the family rank (Table 4). After removing 324 singleton and 98 doubleton sequences, 18,200 sequences remained (Fig. 4). The generation of OTUs at 97% similarity produced 68 clusters of size ≥ 100, containing 16,285 sequences before removal of site singletons and doubletons, and 16,239 after removal, or 29.44% of GLGQ sequences for 16S v3-v4.

3.2.3: 16S v6 marker

Initial attempts at processing the 16S v4-v6 fragment resulted in an 83.9% average failure of aligned sequences when filtered using PRINSEQ, compared to an average failure of 8.5% for v3-v4 sequences. Investigation into this failure determined that the 550 bp v4-v6 region chosen as a marker exceeded the fragment size for which the 2x250 forward and reverse sequences

produced by the Illumina MiSeq could overlap and be aligned using PANDA, and were thus being discarded by PRINSEQ. By using only the reverse sequence, a ~150 base fragment that included the v6 region of 16S could be produced and successfully quality filtered/scored with PRINSEQ, although the failure rate remained high, with only 43,754 sequences remaining of the starting 216,509 after quality filtering. This is likely due to the smaller fragment size amplifying the effects of a single mismatch on the overall Phred score. 135 OTUs of 97% similarity and size ≥ 100 were produced, containing 24,073 sequences before removal of site singletons and doubletons, and 24,022 after removal, or 54.90% of GLGQ sequences for 16S v6.

3.3 Distribution and Characterization of COI Sequences:

3.3.1: Taxonomy results using MEGAN 4

Across all sites, a total of 229,145 sequences could be assigned taxonomic identities to the species level, distributed across 110 species, representing 7 orders, 23 families, and 57 genera (Appendix I). Four orders (Diptera, Entomobryomorpha, Hymenoptera and Lepidoptera) were common to all three sites. Two orders were unique to a given site: Hemiptera was found only at San Emilio (Site 2) and Mantodea was unique to the firebreak (Site 1). The seventh order, Cladocera, was present at San Emilio and the firebreak, and absent in Bosque Humedo. In addition to species-level identification, 49,524 sequences could be assigned to the level of genus, 95,478 could be assigned identities to the family level, and 797,299 could be identified to the order level. In total, of the 8,069,380 GLGQ sequences, 1,597,450, or %, could be given either a complete or partial taxonomic identity using a BLAST alignment to 98% identity

against reference library sequences, leaving 6,471,930 sequences unassigned, or % of all GLGQ COI sequences. Of the sequences that could be assigned an order, 58.62% (467,412 sequences) were from Bosque Humedo (Site 3) Malaise traps, 31.70% (252,774) were from the Firebreak (Site 1) and the remaining 9.68% (77,113) were from San Emilio (Site 2). For sequences identified to the family rank, 50.94% were from Bosque Humedo, 28.98% were from San Emilio and 20.08% were from Firebreak traps. For sequences assigned to a genus, distribution was even across the sites: 33.64% at San Emilio, 33.51% at Bosque Humedo and 32.85% at the firebreak. It should be noted that these percentages reflect relative sequence abundance only, and should not be taken as a measure of relative species abundance.

3.3.2: COI OTUs: Assignment of Orders:

From the 95% similarity OTUs, 19 orders of arthropod were identified (Table 5), inclusive of the 7 identified through MEGAN and species-level identification. 62% of clustered GLGQ sequences could be assigned an identity to the order level (Fig. 7), representing 4,633,746 sequences. 39,191 could be assigned a class. 2,757,280 remained unassigned. Although species-level identifications are possible only for seven orders, five orders are known to contain either entirely aquatic species (Cladocera), a mix of aquatic and terrestrial species (Amphipoda) or species with an aquatic juvenile phase (Ephemeroptera, Megaloptera, and Trichoptera).

3.4 Distribution and Characterization of 16S sequences:

3.4.1: Richness and Distribution:

Combining the presence-absence data for both the v3-v4 and v6 fragments, there are 30 orders of bacteria represented in the Malaise capture across the three sites, containing representatives of 58 families and 132 genera (Appendix II).

29 of 30 orders were captured using the v3-v4 fragment, 12 of which were also detected using the v6 fragment. One order, Legionellales, representing a single family and genus, was detected only with the v6 fragment. This family (Coxiellaceae) is the only family detected by the v6 marker alone, with 19 of 57 detected by both markers. 10 genera are detected by the v6 fragment alone, and 37 of 57 are detected with both markers.

3.5 Statistical Analysis with VEGAN:

β-diversity of COI sequences:

As shown in the analysis of dissimilarity (ADONIS) data recorded in Table 5, significant (p-value < 0.05) observed COI beta diversity between the three ACG sites is present for OTUs, and at the family, genus and species levels of taxonomy. It is not significant at order level. R² values for each level reveal that between-site beta diversity, framed as the correlation between site and dissimilarity, is highest for the OTUs (R² = ) and lowest for genus level (R² = ). In Figure 14, patterns in homogeneity of dispersal (HoD) show similar trends at the OTU, genus and family levels, with the widest dispersion present at the Bosque Humedo (primary dry forest) for all three levels, representing a greater degree of within-site variability than at the firebreak (field edge secondary dry forest) and San Emilio (secondary dry forest) sites. Within-site beta diversity can be roughly calculated in a HoD plot by comparing the mean distances to centroid for each dataset: for COI, the means of all three sites differ at the OTU level (Fig. 14 e), but at all higher levels at least two sites show near-identical means: Bosque Humedo and the firebreak at the order level, and San Emilio and the firebreak at the family and genus level.

As principal coordinate analysis (PCoA) requires at least some within-site dissimilarity and the San Emilio dataset contained 0 values at the order level, the order level PCoA failed to plot correctly (Fig. 15 a). At the species and OTU levels all sites are visibly distinct from each other, reflecting the significant between-site β-diversity. There is overlap between the Firebreak and San Emilio sites at the genus level. We can also observe within-site β-diversity with PCoA, represented by the size of the triangle created by the replicates for each site; there is high within-site β-diversity present at the primary forest Bosque Humedo site at both the family and genus level, compared to the secondary San Emilio and Firebreak sites.

β-diversity of 16S sequences:

ADONIS data for 16S v3-v4 sequences, recorded in Appendix II, demonstrate that significant observed 16S beta diversity is present between the three ACG sites for OTUs, and at the order, family and genus levels of taxonomy. R² values for each level reveal that between-site beta diversity is highest for the OTUs (R² = ) and lowest for genus level (R² = ). At the genus level, the mean distance to centroid for HoD (Fig. 16) is roughly equal across sites. As taxonomic rank increases, the means diverge from each other. Bosque Humedo remains closest to the centroid, while San Emilio moves the farthest. At all but order level, San Emilio displays the largest deviation from the mean. At order level, the largest deviation from the mean belongs to the Firebreak site. When viewed as PCoA plots (Fig. 17), all sites are visibly distinct, with a small overlap between Bosque Humedo and San Emilio at the family level. The large area covered by each site's replicates indicates a large degree of within-site diversity, particularly when compared to COI.
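To make the R² values reported by ADONIS more concrete, the sketch below computes a permutational multivariate analysis of variance directly from a distance matrix, following the standard sums-of-squares partition for distance data. It is a simplified stand-in written for illustration, not the VEGAN code: the function name `permanova`, the toy distance matrix and the site labels are assumptions, and the toy distances (1 within a site, 3 between sites) are invented rather than taken from the thesis data.

```python
import random


def permanova(dist, groups, n_perm=999, seed=0):
    """Return (pseudo_F, R2, p) for a square distance matrix and group labels."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(
        dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)
    ) / n

    def pseudo_f(assignment):
        # Within-group sum of squares, one term per group.
        ss_within = 0.0
        for label in labels:
            idx = [i for i in range(n) if assignment[i] == label]
            ss_within += sum(
                dist[i][j] ** 2 for k, i in enumerate(idx) for j in idx[k + 1:]
            ) / len(idx)
        ss_among = ss_total - ss_within
        if ss_within == 0:
            return float("inf"), 1.0
        f = (ss_among / (a - 1)) / (ss_within / (n - a))
        return f, ss_among / ss_total

    f_obs, r2 = pseudo_f(groups)
    # Permutation test: shuffle the site labels and recompute pseudo-F.
    rng = random.Random(seed)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(shuffled)[0] >= f_obs:
            hits += 1
    return f_obs, r2, (hits + 1) / (n_perm + 1)


# Toy design mirroring nine traps at three sites (distances are invented).
sites = ["FB"] * 3 + ["SE"] * 3 + ["BH"] * 3
toy = [
    [0.0 if i == j else (1.0 if sites[i] == sites[j] else 3.0) for j in range(9)]
    for i in range(9)
]
f_stat, r_squared, p_value = permanova(toy, sites)
print(f_stat, round(r_squared, 3), p_value)
```

In this toy case R² is the among-site sum of squares (25) divided by the total (28), so most of the dissimilarity is attributable to site, which is the sense in which R² is read as between-site beta diversity in this chapter.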

Chapter 4: Discussion

4.1: DNA sequencing results: quantity and quality

The quality and quantity of DNA sequence data produced through next generation sequencing protocols, and their relevance to addressing biological questions such as beta diversity, are influenced by a number of experimental design factors. Distal factors may include the study site selected, or the collection method chosen. More proximate factors involve the DNA extraction protocols used, the design of PCR primers and PCR programs used for amplification, the choice of sequencing platform and reagent chemistry, and the depth of coverage allotted to the samples. After sequencing, the bioinformatic processes employed can have a profound effect on the nature of the data produced, and its utility for statistical analysis. Comparing Table 2, the nucleic acid concentration for each trap's extract, to Figs. 4 and 5, the total number of good length and good quality (GLGQ) sequences produced for each marker region, there is no evidence to suggest that the number of sequences produced was affected by the nucleic acid concentration of the initial DNA extracts: the two lowest-concentration extracts (2A and 2C) yield the second and third highest sequence counts for COI, but produce moderate sequence counts for both 16S primers, while the first and fourth highest COI sequence counts (3B and 3A) correspond to moderate concentrations of DNA. The normalizing effects of the 1/100th dilution used as PCR template may have contributed to this outcome. Figures 4 and 5 also allow an assessment of relative primer performance across sites. For the three COI primer sets (B3/ER, F4/R5 and F10R3), the overall count of GLGQ sequences per site varies, but is evenly distributed between the three primer sets, indicating that all COI primers successfully amplified their target DNA and performed equally well at all sites (Fig. 4). The performance of the 16S primers is mixed: while 16S v3-v4 is fairly consistent across traps, the

v6 primer underperforms on the Bosque Humedo traps (3A, 3B, 3C) when compared to v3-v4, outperforms v3-v4 on two of the Firebreak traps (1A, 1C) and is relatively equal in its performance on San Emilio trap extracts. It is unclear whether this variation in performance is due to the quality of the bacterial DNA present, its affinity for the v6 primer or a combination of other factors. In terms of comparing COI primer to 16S primer performance, paired COI sequences from all sites and primer combinations produced 8.53 million reads prior to quality filtering, or 68% of the maximum available 12.5 million reads for a half-run of the Illumina MiSeq (Table 3). 16S sequences from all site and primer combinations produced 424,008 unfiltered reads, using 3.4% of the available 12.5 million reads (Table 4). While the 20-fold difference between the COI and 16S sequence reads suggests that further optimization of 16S amplicon preparation may be recommended for future studies, sequencing of both COI and 16S primers generated quantities of sequence reads suitable for further bioinformatic processing and analysis. Information on sequence quality is obtained from the results of PRINSEQ quality filtering, where 94.59% of COI sequences passed quality filtering, as did 91.45% of the 16S v3-v4 fragment. As the ~550 bp length of the full v4-v6 fragment was unable to be aligned due to a gap between the forward and reverse primer sequences, the reverse primer sequence, containing the v6 region of 16S, was quality filtered individually, without paired alignment. Only 11.86% of the 16S v6 fragment sequences passed quality filtering, despite outperforming v3f/v4r in terms of raw sequences prior to quality filtering (Table 4). This may be due to the small size (~150 bases average size) of the v6 fragment amplifying the effects of an ambiguously called base on quality scoring.
For primer sets that could be paired (COI B/E, F4/R5 and f10r3, and 16S v3f-v4r) the Illumina MiSeq produced sequence data of high quality.
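The read-budget figures quoted in this section can be checked with a few lines of arithmetic. The sketch below is only a worked calculation using counts stated in the text (8,530,853 paired COI reads, 8,069,380 COI reads passing filtering, 424,008 unfiltered 16S reads and a 12.5 million read half-run); the variable and function names are chosen here for illustration.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, to two decimals."""
    return round(100 * part / whole, 2)


HALF_RUN_READS = 12_500_000    # stated capacity of a MiSeq half-run
coi_paired_reads = 8_530_853   # COI reads after paired end alignment
coi_glgq_reads = 8_069_380     # COI reads passing quality filtering
s16_unfiltered_reads = 424_008  # 16S reads before quality filtering

coi_pass_rate = percent(coi_glgq_reads, coi_paired_reads)
coi_share_of_run = percent(coi_paired_reads, HALF_RUN_READS)
s16_share_of_run = percent(s16_unfiltered_reads, HALF_RUN_READS)
coi_to_16s_ratio = coi_paired_reads / s16_unfiltered_reads

print(coi_pass_rate, coi_share_of_run, s16_share_of_run, round(coi_to_16s_ratio, 1))
```

The results agree with the rounded values given above: roughly 94.59% of COI reads passing, about 68% and 3.4% of the half-run used by COI and 16S respectively, and an approximately 20-fold difference in read counts.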

4.2: Metabarcoding: OTUs and Taxonomic Identification:

4.2.2: Taxonomic Assignment

Cytochrome C oxidase I:

Detailed in Table 5 (for OTU clusters) and Appendix I (for species-level taxonomy), all sites have clusters that are assigned to arthropod orders that are either aquatic or feature an aquatic juvenile stage. The timing of collection for the end of the rainy season ensures the presence both of fresh water, in the form of vernal pools and seasonal streams, and of the adult forms of aquatic juveniles. Purely aquatic orders, such as the Diplostraca (one confirmed genus, Kurzia, with a likely mis-classification at the species level, of the North American species Kurzia media) must reasonably appear in Malaise captures as gut contents. As the Ephemeroptera, Megaloptera and Trichoptera all feature an aquatic juvenile stage followed by a short-lived adult stage, these may be likely sources for the purely aquatic Kurzia DNA. Order Amphipoda, likewise ubiquitous across sites, is ambiguous without additional taxonomic information: some amphipods are aquatic, some are terrestrial detritivores. The highest richness of clusters overall was found among those currently unable to be assigned an identity at the order level. This suggests that deeper taxonomic study of the area may prove beneficial. Earlier metabarcoding studies of species-rich samples reveal the same pattern (Yu et al, 2012; Gibson et al, 2014). Although my sampling was not designed to provide comprehensive analysis of taxonomic groups in the sites sampled, some interesting observations were possible from the data analysed. Bosque Humedo (BH), the primary forest site, contains both the largest number of orders and the greatest observed diversity of clusters for the Araneae, Coleoptera, Ephemeroptera, Hemiptera, Hymenoptera, Neuroptera, Orthoptera and Psocoptera, and is the only site with Embioptera and Megaloptera clusters present. It contains only a single cluster of Diplostraca.

The former San Emilio plantation (SE), while its 80+ year regeneration appears visually similar to the primary forest site, is the lowest in both order richness and in the cluster richness within its orders. It has the lowest diversity of Collembola, Diptera and Lepidoptera of the three sites, although it has the highest variation in Diplostraca, with 21 observed clusters to the 9 of the Firebreak and the single one of Bosque Humedo. It also contains the highest number of unassigned clusters, and was the only site where order Blattodea was observed. The field edge site at the firebreak (FB) held the midpoint in terms of number of orders, but it provided the richest orders of Collembola, Diptera, Lepidoptera and Trichoptera, as well as the only traps to contain Mantodea DNA. The field edge traps also contained the highest number of unique species, concentrated in the Lepidoptera. Based on the limited sampling I have completed, I cannot determine whether this high degree of species diversity is due to the field edge's presence within an ecotone. That 19 orders of terrestrial arthropod were detected in trap contents, but only 7 contained sequences that could be identified to the species level, suggests that targeting the under-represented orders with additional taxonomic inventory paired with barcode identification could prove a fruitful area of future taxonomic research. In total, 110 species-level identifications were made using COI metabarcode data, distributed across 7 orders, 23 families and 57 genera (Appendix I).
Previous studies of tropical arthropod diversity using Malaise capture can give a sense of the diversity of tropical forest arthropods obtained using a morphological approach to taxonomic identification: 56 families of Diptera were identified from 7 rainforest sites across Australia and Papua New Guinea using a combination of canopy fogging, pan traps and 3 Malaise traps placed for 9 catch-days, but species level identifications were not made (Kitching et al, 2004). A 10 catch-day run of

Malaise traps in Sulawesi, Indonesia captured 28 families of Hymenoptera, but again, species-level identification was not made (Noyes, 1989). Considering neotropical forests, a study of Costa Rican ants designed specifically to focus on ant diversity was able to identify 165 species, but did so using multiple 14 catch-day collection runs of 7 Malaise traps, obtaining 62 separate samples across a 13 month period (Longino et al, 2002). Contrasted with these studies, the performance of species-level identification in this project is mixed -- it consistently provides species-level identifications, which many morphologically-based studies cannot do, but it is not able to provide partial taxonomic classification to sequences for which no species-level record yet exists, and the number of families identified across all observed orders is lower than those discovered while focusing on a single one. As species-level identification accounts for only 19.21% of available GLGQ COI sequences, this leaves the majority of captured arthropod diversity unclassified. COI OTUs outperform species-level taxonomic identification with regards to capturing available sequence diversity. When clustered at the species-like similarity of 98%, OTU clusters of size n > 100 account for 91.15% of available GLGQ sequence for COI. When clustered at 95% to obtain order-level taxonomic information, 62% of available GLGQ sequence could be assigned an order, with 19 orders detected, inclusive of the 7 containing species-level identifications.

16S v3-v4 & v6:

Expanding on the single trap used in Gibson et al. (2014), 16S bacterial analysis was conducted from total DNA obtained in Malaise traps as an approach to estimate both arthropod diversity and the associated bacterial microbiome in one sampling effort. Sequence identification for 16S was done using RDPipeline's classifier tool, which assigned a full taxonomic identity to each

sequence, but paired each taxonomic level with a confidence rating for that assignment. As a result, a classification using the master list of OTUs was not conducted, as order data for all sequences could be obtained from the initial classifier run, and thus determining the orders present for sequences that failed to make the cut off for inclusion in the list of genera was a matter of revisiting the original classifier output prior to the application of cut offs. This does mean that the OTUs are not linked directly to an order for 16S. 16S v6 contributed only one order, Legionellales, not present in the v3-v4 data, but provided support for 12 more, so the presence-absence data for the two primers was combined in a single presence-absence table (Appendix II). Of the 30 bacterial orders observed, 17 were ubiquitous across sites and 9 were common to two. 4 orders are common to the Firebreak and San Emilio (Desulfovibrionales, Fusobacteriales, Neisseriales and Pasteurellales), 3 are common to the Firebreak and Bosque Humedo (Legionellales, Oceanospirillales and Opitutales) and 2 (Erysipelotrichales and Rhodobacterales) are common to San Emilio and Bosque Humedo. Orders Aeromonadales and Chlamydiales are unique to the firebreak, Bifidobacteriales is unique to San Emilio and Bacillales is unique to Bosque Humedo, the primary forest. Among the common orders, Actinomycetales contains the largest number of families (8), while Enterobacteriales, despite containing only a single family (Enterobacteriaceae), contains the largest number of individual genera (23). When considering the diversity of genera, it is important to bear in mind that the classifier results were obtained with each site treated separately, rather than as a master list.
As a result, some of the diversity within genera-rich orders such as the Enterobacteriales may be an artifact of classification: if two or more genera were equally likely for a sequence, it may have been assigned a different genus depending on the site it was clustered at. Clustering as a master list
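The grouping of orders into ubiquitous, shared and site-unique categories can be derived mechanically from a presence-absence table. The Python sketch below shows one way to do this; it uses only a small illustrative subset of the orders named above (the full table is in Appendix II), treats Actinomycetales and Enterobacteriales as present at all three sites for the purpose of the example, and the function name `sharing_pattern` is hypothetical.

```python
from collections import Counter

# Illustrative subset only; not the complete Appendix II table.
site_orders = {
    "Firebreak": {
        "Actinomycetales", "Enterobacteriales",
        "Desulfovibrionales", "Legionellales", "Aeromonadales",
    },
    "San Emilio": {
        "Actinomycetales", "Enterobacteriales",
        "Desulfovibrionales", "Erysipelotrichales", "Bifidobacteriales",
    },
    "Bosque Humedo": {
        "Actinomycetales", "Enterobacteriales",
        "Legionellales", "Erysipelotrichales", "Bacillales",
    },
}


def sharing_pattern(site_taxa):
    """Classify each taxon by the number of sites at which it was detected."""
    n_sites = len(site_taxa)
    counts = Counter(taxon for taxa in site_taxa.values() for taxon in taxa)
    return {
        "ubiquitous": sorted(t for t, c in counts.items() if c == n_sites),
        "shared": sorted(t for t, c in counts.items() if 1 < c < n_sites),
        "unique": sorted(t for t, c in counts.items() if c == 1),
    }


pattern = sharing_pattern(site_orders)
for category, orders in pattern.items():
    print(category, orders)
```

Applied to the complete table, the three category counts should sum to the total number of orders observed, which offers a quick consistency check on the tallies reported in the text.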

with site data retained and then running the entire master file through classifier would clarify this. For the 16S v3-v4 fragment, the number of GLGQ sequences assigned a taxonomic ID exceeds the number of sequences clustered into OTUs for 5 out of the 9 Malaise traps (Fig. 12). This apparent discrepancy -- clustering at 97% should produce OTUs that include the taxonomically identified genera, as well as unidentified sequences -- may be explained by the filtering step where only OTUs of cluster size ≥ 100 were retained. While restricting clusters to those of ≥ 100 sequences served to restrict the number of COI clusters produced to a manageable size for bioinformatic processing and statistical analysis, the much smaller starting pool of GLGQ 16S sequences (98,912, vs 8,293,292 for COI) suggests that a less restrictive filter may have been more appropriate. The 16S v6 fragment sequences with taxonomic identities do not outnumber the v6 OTUs, which may be explained by the small number of taxonomic identities that could be assigned to the short (~150 bp) sequence fragments produced by the v6 primer: there are so few taxonomic identities that the n ≥ 100 cut off for cluster size is still less harsh than the alignment criteria required for a taxonomic ID. When determining an appropriate cut-off for less rich NGS outputs, a stepwise approach minimizing the number of GLGQ sequences lost at varying filtering stringencies may be beneficial: from the original ≥ 100 cluster size, test at 50, 25 and 10, or simply dispense with discarding any clusters and conduct only the second-stage filtering of removing singleton and doubleton sequences on a per-sample basis. One should also consider the effect of clustering thresholds: the higher the percentage identity required to cluster, the larger the number of clusters produced, but the smaller the average cluster size will be. The size of the fragment

being clustered will also influence the results here, as the effects of a miscalled base will be amplified as the fragment size decreases. The 16S v3-v4 sequence data retained following filtering for ≥ 100 cluster size remains enough to successfully identify patterns of beta diversity within the metabarcoded 16S sequences, but determining an appropriate cut off for 16S cluster size to best minimize the number of discarded sequences without introducing Type I errors will require further analysis.

4.3: Statistical Analysis

The R² values for between-site beta diversity show a general trend of increasing as the level of taxonomic rank decreases (Fig. 13). As the R² variable in an ADONIS table reflects the proportion of dissimilarity that can be attributed to a given variable, in this case site, this is an expected result. The trend can be explained by the hierarchical nature of taxonomic classification: the 116 identified species belong to only 7 orders for COI, with the 132 identified genera for 16S belonging to 30 orders. As the number of common variables between sites within a presence-absence matrix increases, the proportion of dissimilarity that can be explained by site alone decreases. That this trend of increasing R² as taxonomic rank decreases is not universal, as seen with the R² for COI at the family level (Fig. 13) exceeding that of genus, indicates another constraint when working with presence-absence taxonomy data: the matrix tables depend on taxonomic information being available for the organisms identified at the taxonomic level the matrix is generated for. As there are 95,478 sequences for which the taxonomic hierarchy is unclear below the family level, this increased the dissimilarity between sites at the family level relative to genus and species, and thus resulted in a higher R² value. OTU matrices, being generated directly from sequence data, are not constrained by available taxonomic records and thus provide

a closer approximation of the dissimilarity between the three sites, as captured in Malaise trap contents. The decisions to discard singleton and doubleton sequences at each trap for the 16S matrices, and to discard sequence counts of < 10 per trap for the COI matrices before their conversion into presence-absence tables by VEGAN, are a compromise between Type I and Type II statistical errors. By choosing to discard low- or no-cluster sequences, we choose to discard sequences that may have arisen from miscalled bases during sequencing and thus minimize Type I error. However, as it is impossible to prove whether the low- or no-cluster sequences are miscalls or are instead from locally rare or poorly-amplified species, it is conceivable that some genuinely present sequences may have been discarded. When this occurs for clusters for which high sequence counts are present in traps at one site, but whose presence at others falls below the threshold for inclusion, the exclusion results in a report of no presence at a given site. This may thus potentially enhance dissimilarity artificially, and produce a Type II error.

4.4: Future Directions:

4.4.1: Ecological Analysis:

Although the inherent biases resulting from PCR amplification used in data generation mean that species abundance cannot yet be reliably measured with metabarcoding (Aird et al, 2011; Shokralla et al, 2012), the presence-absence data generated by this study can still be used to investigate several areas of ecological interest. First and foremost, the presences and absences of arthropods and bacteria at sites can be correlated to each other, with observed patterns potentially serving as either internal validation, if the correlated organisms are identifiable, or as a means of generating future hypotheses if the correlation exists only with OTU data.
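The filtering trade-off discussed in section 4.3, and the stepwise cut-off testing suggested in section 4.2, can be sketched in a few lines of Python. The example below is illustrative only: the cluster sizes and per-trap read counts are invented, and the function names `sweep_cluster_cutoffs` and `presence_absence` are assumptions rather than part of any pipeline used in this thesis.

```python
def sweep_cluster_cutoffs(cluster_sizes, cutoffs=(100, 50, 25, 10, 1)):
    """For each minimum cluster size, report clusters kept and reads retained."""
    total_reads = sum(cluster_sizes)
    report = {}
    for cutoff in cutoffs:
        kept = [size for size in cluster_sizes if size >= cutoff]
        report[cutoff] = (len(kept), sum(kept) / total_reads)
    return report


def presence_absence(counts_by_trap, min_reads):
    """Convert per-trap read counts to sets of OTUs scored as present."""
    return {
        trap: {otu for otu, n in counts.items() if n >= min_reads}
        for trap, counts in counts_by_trap.items()
    }


# Invented cluster sizes for a small, low-richness dataset.
sizes = [500, 120, 60, 30, 12, 5, 2, 1]
report = sweep_cluster_cutoffs(sizes)
for cutoff, (n_clusters, fraction) in report.items():
    print(cutoff, n_clusters, round(fraction, 3))

# Invented read counts: each OTU is abundant in one trap and rare in the other.
counts = {
    "1A": {"otu1": 400, "otu2": 8},
    "3A": {"otu1": 6, "otu2": 250},
}
strict = presence_absence(counts, min_reads=10)
lenient = presence_absence(counts, min_reads=1)
print(strict, lenient)
```

With the stricter threshold the two toy traps share no OTUs, while with the lenient one they share both, which illustrates how a per-trap cut-off can convert low read counts into apparent absences and inflate dissimilarity.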

For metabarcoding sequences where a taxonomic and ecological context exists, and where suitable markers have been employed to capture available DNA, it may be possible for future research to investigate food webs, host-pollinator interactions or predator-prey interactions using arthropods, their gut contents and accessory environmental DNA (Sheppard et al, 2004; Jousselin et al, 2008; Hrcek et al, 2011; Clare, 2014). Where interactions are uncharacterized or unseen, it may also be possible to class identified species by their functional guild: pollinator species (Apis mellifera), parasitic species (subfamilies of Ichneumonidae such as the Campopleginae), gut bacteria such as the Enterobacteriales and a range of other species with well-characterized ecological roles were present among the metabarcoded results. Co-occurrence, the mapping of two or more species present or absent from the same sites, is a classic, if historically contentious, use of presence-absence matrix tables of the sort easily generated from metabarcoding data. Making use of the development of mathematical methods accommodating the reality that any sample will fail to capture all the diversity within an area, software such as EcoSim R offers a series of statistical tests for ecological data analysis, including co-occurrence measures (MacKenzie et al, 2004; Gotelli & Ellison, 2013). Because co-occurrence null model analysis is a statistical test, it can be performed using fully identified species or OTUs, and does not require a priori knowledge of their ecological context. Either as part of a longitudinal study tracking appearances or disappearances over time or in single snapshots of an ecosystem, beta diversity data from metabarcoding could be used as a potential measure of ecosystem health.
Particularly when conducted with proper sampling design and given a framework such as functional guilds, co-occurrence or other measures of community structure, an increase or decrease in beta diversity can signal meaningful change in an ecosystem. What an increase or decrease represents will vary depending on the nature of the ecosystem
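As one concrete example of a co-occurrence measure, the sketch below computes the checkerboard score (C-score) of Stone and Roberts from presence-absence data. The text does not specify which co-occurrence index would be used, so the choice of the C-score is an assumption made here for illustration, as are the taxon and trap labels; a full null model analysis would additionally compare the observed score against scores from randomized matrices.

```python
from itertools import combinations


def c_score(occurrences):
    """Mean number of checkerboard units over all pairs of taxa.

    `occurrences` maps each taxon (species or OTU) to the set of sites
    at which it was detected.
    """
    pairs = list(combinations(sorted(occurrences), 2))
    total = 0
    for first, second in pairs:
        shared = len(occurrences[first] & occurrences[second])
        # Sites holding only the first taxon times sites holding only the second.
        total += (len(occurrences[first]) - shared) * (
            len(occurrences[second]) - shared
        )
    return total / len(pairs)


# Hypothetical detections of three OTUs across three traps.
detections = {
    "otuA": {"1A", "2A"},
    "otuB": {"3A"},
    "otuC": {"1A", "2A", "3A"},
}
score = c_score(detections)
print(score)
```

Higher scores indicate that taxa tend to replace one another across sites, while a score of zero means that no pair of taxa shows a checkerboard pattern.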

under study, as well as the source of the dissimilarity, and would thus require careful critical analysis. Nevertheless, the opportunity to create assemblages of species or OTUs offers the potential for a more realistic representation of an ecosystem's functioning and can complement the use of a small number of bioindicator species investigated through conventional approaches.

Additional Observations:

While the study as designed was suitable for determining the viability of metabarcoding as a method for assessing beta diversity, the scope could be expanded for future work in several directions. By including additional COI primer sets, either more general or targeted to classes of organism beyond arthropods, the presence of non-arthropod animal species could be detected in gut contents or in associated environmental DNA. The use of additional marker regions, such as ITS, rbcL or matK, could allow the identification of associated fungal or plant species; the presence of DNA from Apis mellifera, a pollinator, and from Rhizobiales, a bacterial order hosting nitrogen-fixing plant symbionts, suggests that the inclusion of plant barcode primers could be a promising avenue of future investigation. As ample DNA extract remains in long-term storage at -80 °C, this expansion could be done without additional fieldwork. Clearly, additional sequencing from available genetic material is an important advantage of genomics-based biodiversity analysis (e.g. Hajibabaei et al, 2007b) and can be performed more feasibly as the cost of sequencing decreases. At the time of Malaise trap assembly, 10 ml soil cores were taken in triplicate from the litter-cleared soil around each trap. These samples have been stored in long-term freezer space as well, and a study of soil biota beta diversity could be contrasted with the above-ground diversity of the terrestrial arthropods.

In addition to research involving previously-collected material, a repeat experiment could include additional sites, such as a site in the heart of the grasslands, or a comparison of in-park field edges versus field edges beyond the park borders. A study of changes in beta diversity across various points in the rainy and dry seasons is another avenue to consider, as are tests of other collection methods, such as leaf litter sampling or canopy fogging.

Bioinformatics

The large datasets created by metabarcoding, and by NGS in particular, present ongoing bioinformatic challenges. Robust tools are needed both to perform resource-intensive tasks, such as the BLAST search assignment of OTU master files, and to streamline and standardize how information about the site distribution of these OTUs is retained and processed. Additionally, clustering tools such as USEARCH have limitations, and development of more robust tools for initial clustering of NGS data is an active area of research.

With proper development of bioinformatic tools and pipelines, it may be possible to create a 'one-stop shopping' software environment for metabarcode sequences. In such an environment, sequences would be assembled as a master list of OTUs, clustered to the same percentage used for species-level identification, and BLASTed at 90% or better alignment. The top hit for each cluster, along with its distribution across sites, would then be output as taxonomic classifications and as matrix data. The resulting output would contain all the species-level IDs, by selecting only those hits with an alignment score of 97% or better, and would also place those OTU clusters without an identified species within an order where possible. Thus, with a single bioinformatic tool, information on species diversity and sequence diversity from multiple sites would be standardized and available for further analysis.
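The two-threshold output step described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this thesis: the `TopHit` record layout, the function names and the three example hits are hypothetical and exist only for illustration. Only the 97% and 90% thresholds and the site names come from the text.

```python
from dataclasses import dataclass

SPECIES_MIN_IDENTITY = 97.0  # species-level threshold named in the text
SEARCH_MIN_IDENTITY = 90.0   # minimum alignment for a retained BLAST hit

SITES = ["Firebreak", "San Emilio", "Bosque Humedo"]


@dataclass
class TopHit:
    """Best BLAST hit for one OTU (hypothetical record layout)."""
    otu_id: str
    identity: float   # percent identity of the top hit
    order: str        # order of the reference sequence, "" if unknown
    species: str      # species of the reference sequence, "" if unknown
    sites: frozenset  # sites where the OTU was observed


def classify(hit):
    """Return (rank, name): species at >= 97%, else order at >= 90%."""
    if hit.identity >= SPECIES_MIN_IDENTITY and hit.species:
        return ("species", hit.species)
    if hit.identity >= SEARCH_MIN_IDENTITY and hit.order:
        return ("order", hit.order)
    return ("unassigned", "")


def site_matrix(hits, sites):
    """Presence-absence row (1/0 per site) for each OTU, keyed by OTU id."""
    return {h.otu_id: [1 if s in h.sites else 0 for s in sites] for h in hits}


# Invented example hits, not data from this study.
hits = [
    TopHit("OTU_1", 99.1, "Hymenoptera", "Apis mellifera",
           frozenset({"Firebreak", "San Emilio"})),
    TopHit("OTU_2", 92.4, "Lepidoptera", "", frozenset({"Bosque Humedo"})),
    TopHit("OTU_3", 85.0, "Diptera", "", frozenset({"Firebreak"})),
]

for h in hits:
    print(h.otu_id, classify(h), site_matrix(hits, SITES)[h.otu_id])
```

Keeping the classification and the site matrix keyed by the same OTU identifier is one way to let the taxonomic output and the matrix output stay aligned for downstream beta diversity analysis.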

Chapter 5: Conclusions

Taxonomic identification of COI sequences identified 123 arthropod species, distributed across 7 orders, 23 families and 56 genera. Taxonomic identification of 16S sequences identified 132 genera, distributed across 57 families and 30 orders. This suggests that DNA metabarcoding can successfully generate species-level identifications for arthropods, and genus-level identifications for bacteria.

19.21% of quality-filtered COI sequences could be assigned a species-level identity, and 19.35% of quality-filtered 16S sequences could be assigned a genus-level identity. % of filtered COI sequences and 40.81% of filtered 16S sequences could be clustered as OTUs, with unclustered sequences excluded from further analysis. 62% of COI OTUs could be assigned a partial taxonomic identity to the order level, identifying 19 orders, inclusive of the 7 orders with identified species. A significant amount of observable biodiversity in the Malaise contents is overlooked by species/genus-level identification, possibly due to lack of sequence coverage in reference libraries. However, analysis of sequence diversity through OTUs, coupled with high-level annotation of OTUs in taxonomic groups, could provide an overall estimate of biodiversity in species-rich ecosystems such as tropical dry forests.

Species-level identification of terrestrial arthropods, genus-level identification of bacteria and the generation of operational taxonomic units (OTUs) for both arthropods and their associated bacteria resulted in statistically significant (p < 0.05) beta diversity between sites. For bacteria, between-site beta diversity accounts for 62.2% of observed dissimilarity for OTUs, and 41.1% for genera. For terrestrial arthropods, between-site beta diversity accounts for 73.1% of observed OTU dissimilarity and 55.1% of species dissimilarity. Based on these observations,

beta diversity between sites is the primary driver of dissimilarity in metabarcoded Malaise trap contents for both terrestrial arthropods and their associated bacterial groups. Metabarcoding the contents of Malaise traps is therefore an appropriate method for measuring beta diversity in both terrestrial arthropods and their associated bacterial groups.
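The share of dissimilarity attributed to between-site differences is the R² of an ADONIS (permutational multivariate analysis of variance) test. The following Python sketch shows how such an R² and a permutation p-value can be derived from presence-absence data. It is an illustration under stated assumptions, not the analysis code of this thesis: Jaccard dissimilarity is assumed as the distance measure, the function names are hypothetical, and the nine toy samples are invented.

```python
import random
from itertools import combinations


def jaccard(a, b):
    """Jaccard dissimilarity between two sets of taxa (0 = identical)."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def sum_sq(samples, idx):
    """Sum of squared pairwise distances among samples idx, over their count."""
    idx = list(idx)
    total = sum(jaccard(samples[i], samples[j]) ** 2
                for i, j in combinations(idx, 2))
    return total / len(idx)


def permanova_r2(samples, groups):
    """Between-group share of the total sum of squared distances."""
    n = len(samples)
    ss_total = sum_sq(samples, range(n))
    ss_within = sum(
        sum_sq(samples, [i for i in range(n) if groups[i] == g])
        for g in set(groups)
    )
    return 1.0 - ss_within / ss_total


def permutation_p(samples, groups, n_perm=999, seed=0):
    """p-value: share of label shufflings with R^2 at least the observed one.

    With fixed group sizes the pseudo-F statistic increases with R^2,
    so ranking permutations by R^2 gives the same p-value.
    """
    rng = random.Random(seed)
    observed = permanova_r2(samples, groups)
    labels = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if permanova_r2(samples, labels) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Invented toy data: three traps per site, taxa shared only within a site.
samples = [
    {"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"},
    {"e", "f", "g"}, {"e", "f", "h"}, {"e", "g", "h"},
    {"i", "j", "k"}, {"i", "j", "l"}, {"i", "k", "l"},
]
groups = ["Firebreak"] * 3 + ["San Emilio"] * 3 + ["Bosque Humedo"] * 3

print(round(permanova_r2(samples, groups), 3))
print(permutation_p(samples, groups))
```

For this toy data the between-site share works out to about 0.77, because traps within a site share half of their taxa while traps from different sites share none.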

Chapter 6: Tables & Figures

6.1: Tables

Table 1: Malaise trap identification, with geographic location data.

Site            Trap Name   GPS Coordinates (degrees)   Elevation (m above sea level)
Firebreak       1A          N W                         m
Firebreak       1B          N W                         m
Firebreak       1C          N W                         m
San Emilio      2A          N W                         m
San Emilio      2B          N W                         m
San Emilio      2C          N W                         m
Bosque Humedo   3A          N W                         m
Bosque Humedo   3B          N W                         m
Bosque Humedo   3C          N W                         m

Table 2: Spectrophotometry-derived concentration, absorbance values and absorbance ratios for Malaise trap extracts measured by NanoDrop.

[Table body: columns Trap, ng/ml, A230, A260 and further absorbance and ratio columns; one row per trap, 1A to 3C.]

Table 3: Sequence counts for paired and quality filtered COI sequences, by trap and primer set.

[Table body: columns Primer, Trap, Good Length/Quality, Aligned Paired Sequences, % GLGQ; one row per trap (1A to 3C) under each of the primer sets COI-BE, COI-F10R3 and COI-F4R5, followed by a Totals row.]

Table 4: Sequence counts pre- and post-quality filtering for 16S sequences, by trap and primer set. NB: v3-v4 pre-PRINSEQ is paired aligned sequence, v6 is unpaired reverse primer.

[Table body: columns Primer, Trap, GLGQ, Pre-PRINSEQ*, % GLGQ; one row per trap (1A to 3C) under each of 16S v3-v4 and 16S v6, followed by a Totals row.]

Table 5: Count of 95% similarity COI OTU clusters, by order, across sites. Asterisks denote orders with species-level identifications available in Appendix I.

[Table columns: Order, Firebreak, San Emilio, Bosque Humedo, Total, Species Represented. The Order and Species Represented columns are listed below.]

Order            Species Represented
Amphipoda        Freshwater shrimp & terrestrial detritivores
Araneae          Spiders
Blattodea        Cockroaches
Coleoptera       Beetles
*Collembola      Springtails
*Diplostraca     Freshwater crustaceans
*Diptera         True Flies
Embioptera       Webspinners
Ephemeroptera    Mayflies
*Hemiptera       True Bugs
*Hymenoptera     Ants, wasps & bees
*Lepidoptera     Butterflies & Moths
*Mantodea        Mantids
Mecoptera        Scorpionflies
Megaloptera      Alder, Dobson & Fish Flies
Neuroptera       Lacewings
Orthoptera       Grasshoppers & Crickets
Psocoptera       Barklice & Flies
Trichoptera      Caddisflies
Unassigned

Table 6: Significance (p-values) and proportion of attributable variation (R² values) for between-site beta diversity for COI and 16S v3-v4 metabarcoded sequences, for order, family, genus, species and OTU presence-absence data.

[Table body: columns p-value and R² value for each of COI and 16S v3-v4; rows order, family, genus, species, OTUs.]

6.2: Figures

Figure 1: Map of Area de Conservacion Guanacaste with biome distribution. Dr. Waldy Medina,

Figure 2: Topographic map of ACG field sites, with trap placement: Firebreak (blue), San Emilio (yellow) and Bosque Humedo (green). Google Maps,

Figure 3: Assembled Malaise trap 1B, Firebreak site.

Figure 4: Distribution of Good Length/Quality Sequences for COI Primer Sets across Malaise extracts. [Plot: sequence count by trap (1A to 3C); series COI-BE, COI-F10R3, COI-F4R5.]

Figure 5: Distribution of Good Length/Good Quality sequences across Malaise extracts for two 16S marker regions. [Plot: sequence count by trap (1A to 3C); series 16S v3-v4, 16S v6.]

Figure 6: Per-trap count of sequences assigned to arthropod species and OTUs vs total GLGQ sequence for COI metabarcode data. [Plot: sequence count by trap (1A to 3C); series species sequences, OTU sequences, GLGQ sequences.]

Figure 7: Per-trap count of identified arthropod species vs OTU clusters for COI metabarcode data. [Plot: count by trap (1A to 3C); series # species, # OTUs.]

Figure 8: Identity assignment for COI sequences as 95% clustered COI OTUs. [Chart segments: Assigned order or lower, Assigned class, Unassigned; values shown: 37%, 62%, 1%.]

Figure 9: Per-trap count of identified bacterial genera vs OTU clusters for 16S v3-v4 metabarcode data. [Plot: count by trap (1A to 3C); series # genera, # OTUs.]

Figure 10: Per-trap count of sequences assigned to bacterial genera, OTUs and total GLGQ sequence for 16S v3-v4 metabarcode data. [Plot: sequences by trap (1A to 3C); series genera sequences, OTU sequences, GLGQ sequences.]

Figure 11: Per-trap count of identified bacterial genera vs OTU clusters for 16S v6 metabarcode data. [Plot: count by trap (1A to 3C); series # genera, # OTUs.]

Figure 12: Per-trap count of sequences assigned to bacterial genera and OTUs vs total GLGQ sequence for 16S v6 metabarcode data. [Plot: sequences by trap (1A to 3C); series genera sequences, OTU sequences, GLGQ sequences.]

Figure 13: R² value of ADONIS tests of between-site beta diversity, by marker and by taxonomic level, conditioned on site. Note: no species-level data available for 16S; COI beta diversity not significant at order level. [Plot: R² value by level (order, family, genus, species, OTUs); series COI R² value, 16S v3-v4 R² value.]

Figure 14: Homogeneity of Dispersion for metabarcoded COI sequences by site, for order (a), family (b), genus (c), species (d) and OTUs (e).

Figure 15: Principal Coordinates Analysis (PCoA) for metabarcoded COI sequences by site, for order (a), family (b), genus (c), species (d) and OTUs (e). [Panel labels: Bosque Humedo, Firebreak, San Emilio.]

Figure 16: Homogeneity of Dispersion for 16S v3-v4 barcoded sequences by site, for order (a), family (b), genus (c) and OTUs (d).

Figure 17: Principal Coordinates Analysis (PCoA) for 16S v3-v4 metabarcoded sequences by site, for order (a), family (b), genus (c) and OTUs (d). [Panel labels: Bosque Humedo, Firebreak, San Emilio.]

References:

Aird, D., Ross, M. G., Chen, W. S., Danielsson, M., Fennell, T., Russ, C., ... & Gnirke, A. (2011). Analyzing and minimizing PCR amplification bias in Illumina sequencing libraries. Genome Biol, 12(2), R18.
Arroyo-Mora, J. P., Sánchez-Azofeifa, G. A., Kalacska, M. E., Rivard, B., Calvo-Alvarado, J. C., & Janzen, D. H. (2005a). Secondary forest detection in a Neotropical dry forest landscape using Landsat 7 ETM+ and IKONOS Imagery. Biotropica, 37(4).
Arroyo-Mora, J. P., Sánchez-Azofeifa, G. A., Rivard, B., Calvo, J. C., & Janzen, D. H. (2005b). Dynamics in landscape structure and composition for the Chorotega region, Costa Rica from 1960 to Agriculture, Ecosystems & Environment, 106(1).
Baird, D. J., & Hajibabaei, M. (2012). Biomonitoring 2.0: a new paradigm in ecosystem assessment made possible by next generation DNA sequencing. Molecular Ecology, 21(8).
Basset, Y., Cizek, L., Cuénoud, P., Didham, R. K., Guilhaumon, F., Missa, O., ... & Leponce, M. (2012). Arthropod diversity in a tropical forest. Science, 338(6113).
Bertrand, C., Janzen, D. H., Hallwachs, W., Burns, J. M., Gibson, J. F., Shokralla, S., & Hajibabaei, M. (2014). Mitochondrial and nuclear phylogenetic analysis with Sanger and next-generation sequencing shows that, in Area de Conservacion Guanacaste, northwestern Costa Rica, the skipper butterfly named Urbanus belli (family Hesperiidae) comprises three morphologically cryptic species. BMC Evolutionary Biology, 14(1), 153.
Boinski, S., & Fowler, N. L. (1989). Seasonal patterns in a tropical lowland forest. Biotropica.
Brown, J., Janzen, D., Hallwachs, W., Zahiri, R., Hajibabaei, M., & Hebert, P. D. (2014). Cracking complex taxonomy of Costa Rican moths: Anacrusis Zeller (Lepidoptera: Tortricidae). Journal of the Lepidopterists' Society, 68(4).
Calvo-Alvarado, J., McLennan, B., Sánchez-Azofeifa, A., & Garvin, T. (2009). Deforestation and forest restoration in Guanacaste, Costa Rica: putting conservation policies in context. Forest Ecology and Management, 258(6).
Cardinale, B. J., Duffy, J. E., Gonzalez, A., Hooper, D. U., Perrings, C., Venail, P., & Naeem, S. (2012). Biodiversity loss and its impact on humanity. Nature, 486(7401).
Clare, E. L. (2014). Molecular detection of trophic interactions: emerging trends, distinct advantages, significant considerations and conservation applications. Evolutionary Applications, 7(9).
Coissac, E., Riaz, T., & Puillandre, N. (2012). Bioinformatic challenges for DNA metabarcoding of plants and animals. Molecular Ecology, 21(8).
Collins, R. A., & Cruickshank, R. H. (2013). The seven deadly sins of DNA barcoding. Molecular Ecology Resources, 13(6).

Deagle, B. E., Jarman, S. N., Coissac, E., Pompanon, F., & Taberlet, P. (2014). DNA metabarcoding and the cytochrome c oxidase subunit I marker: not a perfect match. Biology Letters, 10(9).
DeSalle, R., Egan, M. G., & Siddall, M. (2005). The unholy trinity: taxonomy, species delimitation and DNA barcoding. Philosophical Transactions of the Royal Society of London B: Biological Sciences, 360(1462).
DeSalle, R. (2006). Species discovery versus species identification in DNA barcoding efforts: response to Rubinoff. Conservation Biology, 20(5).
Dufrêne, M., & Legendre, P. (1997). Species assemblages and indicator species: the need for a flexible asymmetrical approach. Ecological Monographs, 67(3).
Dunn, R. R. (2004). Recovery of faunal communities during tropical forest regeneration. Conservation Biology, 18(2).
Edgar, R. C. (2010). Search and clustering orders of magnitude faster than BLAST. Bioinformatics, 26(19).
Ettema, C. H., & Yeates, G. W. (2003). Nested spatial biodiversity patterns of nematode genera in a New Zealand forest and pasture soil. Soil Biology & Biochemistry, 35.
Fedigan, L. M., & Jack, K. (2001). Neotropical primates in a regenerating Costa Rican dry forest: a comparison of howler and capuchin population patterns. International Journal of Primatology, 22(5).
Folmer, O., Black, M., Hoeh, W., Lutz, R., & Vrijenhoek, R. (1994). DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Molecular Marine Biology and Biotechnology, 3(5), 294.
Francis, C. M., Borisenko, A. V., Ivanova, N. V., Eger, J. L., Lim, B. K., Guillén-Servent, A., ... & Hebert, P. D. (2010). The role of DNA barcodes in understanding and conservation of mammal diversity in Southeast Asia. PLoS One, 5(9).
Gaston, K. J., & Spicer, J. I. (2004). Biodiversity: an introduction (Ed. 2). Blackwell Science.
Gibson, J., Shokralla, S., Porter, T. M., King, I., van Konynenburg, S., Janzen, D. H., ... & Hajibabaei, M. (2014). Simultaneous assessment of the macrobiome and microbiome in a bulk sample of tropical arthropods through DNA metasystematics. Proceedings of the National Academy of Sciences, 111(22).
Graur, D., & Li, W. H. (2000). Fundamentals of molecular evolution. Sinauer.
Goodman, D. (1975). The theory of diversity-stability relationships in ecology. Quarterly Review of Biology.
Gotelli, N. J., & Ellison, A. M. (2013). EcoSimR.
Hajibabaei, M., Janzen, D. H., Burns, J. M., Hallwachs, W., & Hebert, P. D. (2006a). DNA barcodes distinguish species of tropical Lepidoptera. Proceedings of the National Academy of Sciences of the United States of America, 103(4).

Hajibabaei, M., Smith, M., Janzen, D. H., Rodriguez, J. J., Whitfield, J. B., & Hebert, P. D. (2006b). A minimalist barcode can identify a specimen whose DNA is degraded. Molecular Ecology Notes, 6(4).
Hajibabaei, M., Singer, G. A., Clare, E. L., & Hebert, P. D. (2007a). Design and applicability of DNA arrays and DNA barcodes in biodiversity monitoring. BMC Biology, 5(1), 24.
Hajibabaei, M., Singer, G. A., Hebert, P. D., & Hickey, D. A. (2007b). DNA barcoding: how it complements taxonomy, molecular phylogenetics and population genetics. Trends in Genetics, 23(4).
Hajibabaei, M., Shokralla, S., Zhou, X., Singer, G. A., & Baird, D. J. (2011). Environmental barcoding: a next-generation sequencing approach for biomonitoring applications using river benthos. PLoS One, 6(4).
Hajibabaei, M. (2012). The golden age of DNA metasystematics. Trends in Genetics, 28(11).
Hajibabaei, M., Spall, J. L., Shokralla, S., & van Konynenburg, S. (2012). Assessing biodiversity of a freshwater benthic macroinvertebrate community through non-destructive environmental barcoding of DNA from preservative ethanol. BMC Ecology, 12(1), 28.
Hebert, P. D., Cywinska, A., & Ball, S. L. (2003). Biological identifications through DNA barcodes. Proceedings of the Royal Society of London: Biological Sciences, 270(1512).
Hebert, P. D. N., Stoeckle, M. Y., Zemlak, T. S., & Francis, C. M. (2004a). Identification of birds through DNA barcodes. PLoS Biology, 2(10), e312.
Hebert, P. D., Penton, E. H., Burns, J. M., Janzen, D. H., & Hallwachs, W. (2004b). Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proceedings of the National Academy of Sciences of the United States of America, 101(41).
Hollingsworth, P. M., Forrest, L. L., Spouge, J. L., Hajibabaei, M., Ratnasingham, S., van der Bank, M., ... & Wilkinson, M. J. (2009). A DNA barcode for land plants. Proceedings of the National Academy of Sciences, 106(31).
Hrcek, J. A. N., Miller, S. E., Quicke, D. L., & Smith, M. (2011). Molecular detection of trophic links in a complex insect host-parasitoid food web. Molecular Ecology Resources, 11(5).
Huson, D. H., Auch, A. F., Qi, J., & Schuster, S. C. (2007). MEGAN analysis of metagenomic data. Genome Research, 17(3).
Ives, A. R., & Carpenter, S. R. (2007). Stability and Diversity of Ecosystems. Science, 317, 58.
James, T. Y., Kauff, F., Schoch, C. L., Matheny, P. B., Hofstetter, V., Cox, C. J., ... & Spotts, R. A. (2006). Reconstructing the early evolution of Fungi using a six-gene phylogeny. Nature, 443(7113).
Janzen, D. H. (1966). Coevolution of mutualism between ants and acacias in Central America. Evolution.

Janzen, D. H. (1967). Synchronization of sexual reproduction of trees within the dry season in Central America. Evolution.
Janzen, D. H., & Schoener, T. W. (1968). Differences in insect abundance and diversity between wetter and drier sites during a tropical dry season. Ecology.
Janzen, D. H. (1970). Herbivores and the number of tree species in tropical forests. American Naturalist.
Janzen, D. H. (1975). Pseudomyrmex nigropilosa: a parasite of a mutualism. Science, 188(4191).
Janzen, D. H. (1986). Guanacaste National Park: tropical ecological and cultural restoration. San José: Editorial Universidad Estatal a Distancia.
Janzen, D. H. (1988a). Tropical dry forests: the most endangered major tropical ecosystem. In E. O. Wilson (Ed.), Biodiversity.
Janzen, D. H. (1988b). Management of habitat fragments in a tropical dry forest: Growth. Annals of the Missouri Botanical Garden.
Janzen, D. (1999). Gardenification of tropical conserved wildlands: multitasking, multicropping, and multiusers. Proceedings of the National Academy of Sciences, 96(11).
Janzen, D. H. (2000). Costa Rica's Area de Conservación Guanacaste: a long march to survival through nondamaging biodevelopment. Biodiversity, 1(2).
Janzen, D. H., Hajibabaei, M., Burns, J. M., Hallwachs, W., Remigio, E., & Hebert, P. D. (2005). Wedding biodiversity inventory of a large and complex Lepidoptera fauna with DNA barcoding. Philosophical Transactions of the Royal Society B: Biological Sciences, 360(1462).
Janzen, D. H., Hallwachs, W., Blandin, P., Burns, J. M., Cadiou, J., Chacon, I., ... & Wilson, J. J. (2009). Integration of DNA barcoding into an ongoing inventory of complex tropical biodiversity. Molecular Ecology Resources, 9(s1).
Joly, S., Davies, T. J., Archambault, A., Bruneau, A., Derry, A., Kembel, S. W., ... & Wheeler, T. A. (2014). Ecology in the age of DNA barcoding: the resource, the promise and the challenges ahead. Molecular Ecology Resources, 14(2).
Jousselin, E., Van Noort, S., Berry, V., Rasplus, J. Y., Rønsted, N., Erasmus, J. C., & Greeff, J. M. (2008). One fig to bind them all: host conservatism in a fig wasp community unraveled by cospeciation analyses among pollinating and nonpollinating fig wasps. Evolution, 62(7).
Illumina, Inc. (2015). MiSeq Gene & Small Genome Sequencer. Retrieved March 21, 2015.
Kalacska, M., Sanchez-Azofeifa, G. A., Calvo-Alvarado, J. C., Quesada, M., Rivard, B., & Janzen, D. H. (2004). Species composition, similarity and diversity in three successional stages of a seasonally dry tropical forest. Forest Ecology and Management, 200(1).

Kimmerer, W. J. (1984). Diversity/Stability: A Criticism. Ecology.
Kitching, R. L., Bickel, D., Creagh, A. C., Hurley, K., & Symonds, C. (2004). The biodiversity of Diptera in Old World rain forest surveys: a comparative faunistic analysis. Journal of Biogeography, 31(7).
Koleff, P., Gaston, K. J., & Lennon, J. J. (2003). Measuring beta diversity for presence-absence data. Journal of Animal Ecology, 72(3).
Kress, W. J., & Erickson, D. L. (2012). DNA barcodes: methods and protocols. Humana Press.
Krishnamurthy, P. K., & Francis, R. A. (2012). A critical review on the utility of DNA barcoding in biodiversity conservation. Biodiversity and Conservation, 21(8).
Liu, L., Li, Y., Li, S., Hu, N., He, Y., Pong, R., ... & Law, M. (2011). Comparison of next-generation sequencing systems. Journal of Biomedicine & Biotechnology, 2012.
Longino, J. T., Coddington, J., & Colwell, R. K. (2002). The ant fauna of a tropical rain forest: estimating species richness three different ways. Ecology, 83(3).
Luckey, J. A., Drossman, H., Kostichka, A. J., Mead, D. A., D'Cunha, J., Norris, T. B., & Smith, L. M. (1990). High speed DNA sequencing by capillary electrophoresis. Nucleic Acids Research, 18(15).
MacArthur, R. (1955). Fluctuations of animal populations and a measure of community stability. Ecology, 36(3).
MacKenzie, D. I., Bailey, L. L., & Nichols, J. (2004). Investigating species co-occurrence patterns when species are detected imperfectly. Journal of Animal Ecology, 73(3).
Masella, A. P., Bartram, A. K., Truszkowski, J. M., Brown, D. G., & Neufeld, J. D. (2012). PANDAseq: paired-end assembler for Illumina sequences. BMC Bioinformatics, 13(1), 31.
McGeoch, M. A. (1998). The selection, testing and application of terrestrial insects as bioindicators. Biological Reviews of the Cambridge Philosophical Society, 73(02).
McGinnis, S., & Madden, T. L. (2004). BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Research, 32(suppl 2), W20-W25.
Meusnier, I., Singer, G. A., Landry, J. F., Hickey, D. A., Hebert, P. D., & Hajibabaei, M. (2008). A universal DNA mini-barcode for biodiversity analysis. BMC Genomics, 9(1), 214.
Moritz, C., & Cicero, C. (2004). DNA barcoding: promise and pitfalls. PLoS Biology, 2(10), e354.
Mullis, K. B., Erlich, H. A., Arnheim, N., Horn, G. T., Saiki, R. K., & Scharf, S. J. (1987). U.S. Patent No. 4,683,195. Washington, DC: U.S. Patent and Trademark Office.
Murphy, P. G., & Lugo, A. E. (1986). Ecology of tropical dry forest. Annual Review of Ecology and Systematics.
Noyes, J. S. (1989). A study of five methods of sampling Hymenoptera (Insecta) in a tropical rainforest, with special reference to the Parasitica. Journal of Natural History, 23(2).

Oksanen, J., Kindt, R., Legendre, P., O'Hara, B., Stevens, M. H. H., Oksanen, M. J., & Suggests, M. A. S. S. (2007). The vegan package. Community ecology package.
Peterson, G., Allen, C. R., & Holling, C. S. (1998). Ecological resilience, biodiversity, and scale. Ecosystems, 1(1).
Portillo-Quintero, C. A., & Sánchez-Azofeifa, G. A. (2010). Extent and conservation of tropical dry forests in the Americas. Biological Conservation, 143(1).
Quail, M. A., Smith, M., Coupland, P., Otto, T. D., Harris, S. R., Connor, T. R., ... & Gu, Y. (2012). A tale of three next generation sequencing platforms: comparison of Ion Torrent, Pacific Biosciences and Illumina MiSeq sequencers. BMC Genomics, 13(1), 341.
R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Rey Benayas, J. M., Newton, A. C., Diaz, A., & Bullock, J. M. (2009). Enhancement of Biodiversity and Ecosystem Services by Ecological Restoration: A Meta-Analysis. Science, 325.
Rubinoff, D. (2006). Utility of mitochondrial DNA barcodes in species conservation. Conservation Biology, 20(4).
Rubinoff, D., Cameron, S., & Will, K. (2006). A genomic perspective on the shortcomings of mitochondrial DNA for barcoding identification. Journal of Heredity, 97(6).
Saiki, R. K., Gelfand, D. H., Stoffel, S., Scharf, S. J., Higuchi, R., Horn, G. T., & Erlich, H. A. (1988). Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science, 239(4839).
Sanger, F., Nicklen, S., & Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proceedings of the National Academy of Sciences, 74(12).
Schmieder, R., & Edwards, R. (2011). Quality control and preprocessing of metagenomic datasets. Bioinformatics, 27(6).
Schoch, C. L., Seifert, K. A., Huhndorf, S., Robert, V., Spouge, J. L., Levesque, C. A., ... & Griffith, G. W. (2012). Nuclear ribosomal internal transcribed spacer (ITS) region as a universal DNA barcode marker for Fungi. Proceedings of the National Academy of Sciences, 109(16).
Sheppard, S. K., Henneman, M. L., Memmott, J., & Symondson, W. O. C. (2004). Infiltration by alien predators into invertebrate food webs in Hawaii: a molecular approach. Molecular Ecology, 13(7).
Shokralla, S., Spall, J. L., Gibson, J. F., & Hajibabaei, M. (2012). Next-generation sequencing technologies for environmental DNA research. Molecular Ecology, 21(8).
Shokralla, S., Gibson, J. F., Nikbakht, H., Janzen, D. H., Hallwachs, W., & Hajibabaei, M. (2014). Next generation DNA barcoding: using next generation sequencing to enhance and accelerate DNA barcode capture from single specimens. Molecular Ecology Resources, 14(5).
Smith, M. A., Woodley, N. E., Janzen, D. H., Hallwachs, W., & Hebert, P. D. (2006). DNA barcodes reveal cryptic host-specificity within the presumed polyphagous members of a genus of parasitoid flies (Diptera: Tachinidae). Proceedings of the National Academy of Sciences, 103(10).
Smith, M. A., Wood, D. M., Janzen, D. H., Hallwachs, W., & Hebert, P. D. (2007). DNA barcodes affirm that 16 species of apparently generalist tropical parasitoid flies (Diptera, Tachinidae) are not all generalists. Proceedings of the National Academy of Sciences, 104(12).
Soule, M. E., & Wilcox, B. A. (1980). Conservation biology. An evolutionary-ecological perspective. Sinauer Associates, Inc.
Taberlet, P., Coissac, E., Pompanon, F., Brochmann, C., & Willerslev, E. (2012). Towards next-generation biodiversity assessment using DNA metabarcoding. Molecular Ecology, 21(8).
Taylor, H. R., & Harris, W. E. (2012). An emergent science on the brink of irrelevance: a review of the past 8 years of DNA barcoding. Molecular Ecology Resources, 12(3).
Temple, S. A., & Wiens, J. A. (1989). Bird populations and environmental changes: can birds be bioindicators. American Birds, 43(2).
Thompson, I., Mackey, B., McNulty, S., & Mosseler, A. (2009). Forest resilience, biodiversity, and climate change. In A synthesis of the biodiversity/resilience/stability relationship in forest ecosystems. Secretariat of the Convention on Biological Diversity, Montreal. Technical Series (Vol. 43).
Tringe, S. G., & Hugenholtz, P. (2008). A renaissance for the pioneering 16S rRNA gene. Current Opinion in Microbiology, 11(5).
Townes, H. (1972). A light-weight Malaise trap. Entomological News, 83.
United Nations (1992). Convention on Biological Diversity. Rio de Janeiro, Brazil: Convention on Biological Diversity.
Voelkerding, K. V., Dames, S. A., & Durtschi, J. D. (2009). Next-generation sequencing: from basic research to diagnostics. Clinical Chemistry, 55(4).
Wallace, L. J., Boilard, S. M., Eagle, S. H., Spall, J. L., Shokralla, S., & Hajibabaei, M. (2012). DNA barcodes for everyday life: Routine authentication of Natural Health Products. Food Research International, 49(1).
Wang, Q., Garrity, G. M., Tiedje, J. M., & Cole, J. R. (2007). Naive Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental Microbiology, 73(16).
Whittaker, R. H. (1970). Communities and ecosystems. MacMillan, New York.
Whittaker, R. J., Willis, K. J., & Field, R. (2001). Scale and species richness: towards a general, hierarchical theory of species diversity. Journal of Biogeography, 28(4).
Will, K. W., & Rubinoff, D. (2004). Myth of the molecule: DNA barcodes for species cannot replace morphology for identification and classification. Cladistics, 20(1).

Will, K. W., Mishler, B. D., & Wheeler, Q. D. (2005). The perils of DNA barcoding and the need for integrative taxonomy. Systematic Biology, 54(5).
Wilson, A. C., Carlson, S. S., & White, T. J. (1977). Biochemical evolution. Annual Review of Biochemistry, 46(1).
Wilson, E. (1988). Biodiversity. In National Forum on Biodiversity. Washington DC.
Wilson, K. H. (1995). Molecular biology as a tool for taxonomy. Clinical Infectious Diseases, 20(2), S117-S121.
Welsh, H. H., & Ollivier, L. M. (1998). Stream Amphibians as Indicators of Ecosystem Stress: A Case Study From California's Redwoods. Ecological Applications, 8(4).
Wong, E. H. K., & Hanner, R. H. (2008). DNA barcoding detects market substitution in North American seafood. Food Research International, 41(8).
Yu, D. W., Ji, Y., Emerson, B. C., Wang, X., Ye, C., Yang, C., & Ding, Z. (2012). Biodiversity soup: metabarcoding of arthropods for rapid biodiversity assessment and biomonitoring. Methods in Ecology and Evolution, 3(4).

Appendix I: COI barcode identifications by site for order, family, genus and species.

[Table columns: Order, Family, Genus, Species, FB, SE, BH. Entries are listed by indentation as order, family, genus, species.]

Diplostraca
  Chydoridae
    Kurzia
      Kurzia media*
Diptera
  Ceratopogonidae
    incertae sedis
      Ceratopogonidae sp. BOLD:AAN5170
  Pipunculidae
    Eudorylas
      Eudorylas sp. T13
  Tachinidae
    Atactosturmia
      Unassigned
    Ceromya
      Ceromya sp. Wood03DHJ09
    Drino
      Unassigned
    Spathidexia
      Spathidexia sp. Wood03
    Tachinidae gen. tachjanzen01
      Tachinidae gen. tachjanzen01 sp. Janzen01
      Tachinidae gen. tachjanzen01 sp. Janzen89
      Tachinidae gen. tachjanzen01 sp. Tachinidae gen. tachjanzen01 sp. Janzen89 Janzen89
Collembola
  Entomobryidae
    incertae sedis
      Entomobryidae sp. DPCOL36332
Hemiptera
  Flatidae
    Flatidae gen. flatidbiolep01
      Eulophinae gen. eulohansson01 sp. BioLep01*
Hymenoptera
  Apidae
    Apis
      Apis mellifera
  Braconidae
    Aleiodes
      Aleiodes sp. BOLD:AAH8710
    Dolichogenidea
      Dolichogenidea sp. Janzen30
    Parapanteles
      Parapanteles sp. Whitfield84
    Stantonia
      Stantonia sp. Janzen04
  Formicidae
    Azteca
      Azteca beltii
    Camponotus
      Camponotus novogranadensis
      Camponotus sp. MAS008
      Camponotus sp. MAS009
      Camponotus sp. MAS010
      Camponotus sp. MAS011
      Camponotus sp. MAS026
    Carebara
      Carebara aff. brevipilosa MAS001
    Cephalotes
      Cephalotes sp. MAS003
    Crematogaster
      Crematogaster sp. MAS001
      Crematogaster sp. MAS005
    Dolichoderus
      Dolichoderus bispinosus
    Eciton
      Eciton burchellii
    Leptogenys
      Leptogenys sp. MAS005
    Odontomachus
      Odontomachus bauri
    Pheidole
      Pheidole sp. MAS051
      Pheidole susannae
    Pseudomyrmex
      Pseudomyrmex flavicornis
      Pseudomyrmex nigrocinctus
      Pseudomyrmex sp. BOLD:ABY4068
      Pseudomyrmex sp. BOLD:ABZ7041
      Pseudomyrmex spinicola
  Ichneumonidae
    Ichneumonidae gen. campojanzen01
      Ichneumonidae gen. campojanzen01 sp. Janzen03
    Ichneumonidae gen. ichjanzen01
      Ichneumonidae gen. ichjanzen01 sp. Janzen01
  incertae sedis
    incertae sedis
      Hymenoptera sp. BOLD:AAB3996
      Hymenoptera sp. BOLD:AAF4483
Lepidoptera
  Acrolophidae
    Acrolophidae gen. acrojanzen01
      Acrolophidae gen. acrojanzen01 sp. Janzen01
  Arctiidae
    Cyana
      Unassigned
    Uranophora
      Napata leucotelus
  Crambidae
    Agathodes
      Agathodes sp. BOLD:AAH6214
    Spilomelinae gen. spilobiolep01
      Spilomelinae gen. spilobiolep01 sp. BioLep265
  Erebidae
    Eulepidotis
      Eulepidotis addens
  Gelechiidae
    Pexicopia
      Unassigned
  Geometridae
    Cargolia
      Cargolia sp. BOLD:AAI3696
    Neoselenia
      Neoselenia banasa
    Scopula
      Scopula sp. BioLep28
      Scopula sp. BioLep30
  Hesperiidae
    Morys
      Morys valda
    Parphorus
      Parphorus decora
    Remella
      Remella vopiscus
    Staphylus
      Staphylus azteca
    Synapte
      Unassigned
  incertae sedis
    incertae sedis
      Lepidoptera sp. BOLD:AAA0689
      Lepidoptera sp. BOLD:AAA
      Lepidoptera sp. BOLD:AAA0943
      Lepidoptera sp. BOLD:AAA0988
      Lepidoptera sp. BOLD:AAA1029
      Lepidoptera sp. BOLD:AAA1033
      Lepidoptera sp. BOLD:AAA1044
      Lepidoptera sp. BOLD:AAA1062
      Lepidoptera sp. BOLD:AAA1129
      Lepidoptera sp. BOLD:AAA3461
      Lepidoptera sp. BOLD:AAA4032
      Lepidoptera sp. BOLD:AAB0282
      Lepidoptera sp. BOLD:AAB8022
      Lepidoptera sp. BOLD:AAC9420
      Lepidoptera sp. BOLD:AAF0250
      Lepidoptera sp. BOLD:AAF7490
      Lepidoptera sp. BOLD:AAH5486
      Lepidoptera sp. BOLD:AAH5488
      Lepidoptera sp. BOLD:AAH5499
      Lepidoptera sp. BOLD:AAH5500
      Lepidoptera sp. BOLD:AAH5501
      Lepidoptera sp. BOLD:AAH5516
      Lepidoptera sp. BOLD:AAH5710
      Lepidoptera sp. BOLD:AAH5711
      Lepidoptera sp. BOLD:AAH5731
      Lepidoptera sp. BOLD:AAH5734
      Lepidoptera sp. BOLD:AAH5765
      Lepidoptera sp. BOLD:AAH5819
      Lepidoptera sp. BOLD:AAH5822
      Lepidoptera sp. BOLD:AAH5824
      Lepidoptera sp. BOLD:AAH5848
      Lepidoptera sp. BOLD:AAI2768
      Lepidoptera sp. BOLD:AAM
      Lepidoptera sp. BOLD:AAQ1419
      Lepidoptera sp. BOLD:AAU7691
      Lepidoptera sp. BOLD:ABA5692
      Lepidoptera sp. BOLD:ABA5695
      Lepidoptera sp. BOLD:ABA5698
      Lepidoptera sp. BOLD:ABA8140
  Lycaenidae
    Calycopis
      Calycopis sp. origodhj01
    Lycaenidae gen. lycaenjanzen01
      Lycaenidae gen. lycaenjanzen01 sp. Janzen01
  Noctuidae
    Carteris
      Carteris oculatalis
    Condica
      Condica albigera
    Coscaga
      Coscaga picatalis
    Eulepidotis
      Eulepidotis addens
    Noctuidae gen. noctbiolep01
      Noctuidae gen. noctbiolep01 sp. BioLep325
      Noctuidae gen. noctbiolep01 sp. BioLep423
      Noctuidae gen. noctbiolep01 sp. BioLep480
      Noctuidae gen. noctbiolep01 sp. BioLep850
    Oruza
      Oruza sp. Poole01
    Renia
      Renia sp. BioLep05
  Nymphalidae
    Cissia
      Cissia similis
      Cissia sp. Janzen04
    Hermeuptychia
      Hermeuptychia sp. hermeseco03
  Oecophoridae
    Parocystola
      Parocystola holodryas
  Riodinidae
    Detritivora
      Detritivora barnesi
Mantodea
  Mantidae
    Mantidae gen. mantisjanzen01
      Mantidae gen. mantisjanzen01 sp. Janzen01

93 Appendix II: 16S barcode identifications by site for order, family and genus Order Family Genera FB SE BH Actinomycetales Actinomycetaceae Actinomyces Brevibacteriaceae Brevibacterium Cellulomonadaceae Cellulomonas Corynebacteriaceae Corynebacterium Microbacteriaceae Curtobacterium Leifsonia Leucobacter Pseudonocardiaceae Amycolatopsis Streptomycetaceae Streptomyces Tsukamurellaceae Tsukamurella Aeromonadales Aeromonadaceae Aeromonas Bacillales Listeriaceae Listeria Bacteroidales Bacteroidaceae Bacteroides Porphyromonadaceae Barnesiella Dysgonomonas Paludibacter Parabacteroides Petrimonas Tannerella Prevotellaceae Paraprevotella Prevotella Rikenellaceae Alistipes Rikenella Bifidobacteriales Bifidobacteriaceae Bifidobacterium Burkholderiales Alcaligenaceae Achromobacter Advenella Bordetella 82

94 Pusillimonas Burkholderiaceae Burkholderia Pandoraea Comamonadaceae Comamonas Lampropedia Variovorax Oxalobacteraceae Oxalicibacterium Campylobacterales Campylobacteraceae Campylobacter Chlamydiales Parachlamydiaceae Neochlamydia Clostridiales Lachnospiraceae Clostridium XlVa Clostridium XlVb Coprococcus Dorea Lachnospiracea_ic Ruminococcaceae Anaerofilum Anaerotruncus Clostridium IV Ethanoligenens Oscillibacter Papillibacter Pseudoflavonifractor Ruminococcus Desulfovibrionales Desulfovibrionaceae Desulfovibrio Enterobacteriales Enterobacteriaceae Arsenophonus Biostraticola Brenneria Buttiauxella Cedecea Enterobacter Erwinia Escherichia/Shigella Klebsiella Kluyvera Leclercia 83

95 Leminorella Mangrovibacter Morganella Pantoea Pluralibacter Pragia Proteus Providencia Raoultella Salmonella Serratia Shimwellia Trabulsiella Entomoplasmatales Entomoplasmataceae Entomoplasma Mesoplasma Spiroplasmataceae Spiroplasma Erysipelotrichales Erysipelotrichaceae Allobaculum Clostridium XVIII Flavobacteriales Flavobacteriaceae Chryseobacterium Elizabethkingia Epilithonimonas Myroides Ornithobacterium Riemerella Streptococcaceae Lactococcus Fusobacteriales Leptotrichiaceae Sebaldella Lactobacillales Enterococcaceae Enterococcus Melissococcus Vagococcus Lactobacillaceae Lactobacillus Paralactobacillus 84

96 Leuconostocaceae Fructobacillus Leuconostoc Streptococcaceae Lactococcus Streptococcus Legionellales Coxiellaceae Diplorickettsia Neisseriales Neisseriaceae Paludibacterium Uruburuella Oceanospirillales Halomonadaceae Zymobacter Opitutales Opitutaceae Opitutus Orbales Orbaceae Gilliamella Orbus Pasteurellales Pasteurellaceae Chelonobacter Pseudomonadales Moraxellaceae Acinetobacter Alkanindiges Pseudomonadaceae Pseudomonas Rhizobiales Aurantimonadaceae Aurantimonas Bartonellaceae Bartonella Brucellaceae Brucella Ochrobactrum Paenochrobactrum Pseudochrobactrum Hyphomicrobiaceae Devosia Methylobacteriaceae Methylobacterium Phyllobacteriaceae Mesorhizobium Phyllobacterium Rhizobiaceae Rhizobium Rhodobacterales Rhodobacteraceae Paracoccus Rhodospirillales Acetobacteraceae Asaia Gluconobacter 85

97 Neokomagataea Saccharibacter Rhodospirillaceae Rhodospirillum Telmatospirillum Rickettsiales Rickettsiaceae Orientia Rickettsia Sphingobacteriales Flammeovirgaceae Fabibacter Sphingobacteriaceae Pedobacter Sphingobacterium Sphingomonadales Sphingomonadaceae Sphingomonas Xanthomonadales Xanthomonadaceae Stenotrophomonas 86

98 Appendix III: COI Taxonomy Sequence Matrices N.B.: OTU matrix available as a supplementary .csv file upon request. Order 1A 1B 1C 2A 2B 2C 3A 3B 3C Cladocera Diptera Entomobryomorpha Hemiptera Hymenoptera Lepidoptera Mantodea Family 1A 1B 1C 2A 2B 2C 3A 3B 3C Acrolophidae Apidae Arctiidae Braconidae Ceratopogonidae Chydoridae Crambidae Entomobryidae Erebidae Flatidae Formicidae Gelechiidae Geometridae Hesperiidae Ichneumonidae Lycaenidae Mantidae Noctuidae Nymphalidae Oecophoridae Pipunculidae Riodinidae

99 Tachinidae Unassigned Diptera Unassigned Hymenoptera Unassigned Lepidoptera Hymenoptera IC Lepidoptera IC Genus Acrolophidae gen. acrojanzen Agathodes Aleiodes Apanteles Apis Atactosturmia Azteca Calycopis Camponotus Carebara Cargolia Carteris Cephalotes Ceromya Cissia Condica Coscaga Crematogaster Cyana Detritivora Dolichoderus Dolichogenidea Drino Eciton Elachistidae gen. elachjanzen Eudorylas Eulepidotis

100 Flatidae gen. flatidbiolep Hermeuptychia Ichneumonidae gen. campojanzen Ichneumonidae gen. ichjanzen Kurzia Leptogenys Lycaenidae gen. lycaenjanzen Mantidae gen. mantisjanzen Morys Neoselenia Noctuidae gen. noctbiolep Odontomachus Oruza Parapanteles Parocystola Parphorus Pexicopia Pheidole Pseudomyrmex Remella Renia Scopula Spathidexia Spilomelinae gen. spilobiolep Stantonia Staphylus Synapte Tachinidae gen. tachjanzen Tachinidae gen. tachjanzen01 sp. Janzen

101 Unassigned Uranophora Species 1A 1B 1C 2A 2B 2C 3A 3B 3C Acrolophidae gen. acrojanzen01 sp. Janzen Agathodes sp. BOLD:AAH Aleiodes sp. BOLD:AAH Apis mellifera Azteca beltii Calycopis sp. origodhj Camponotus novogranadensis Camponotus sp. MAS Camponotus sp. MAS Camponotus sp. MAS Camponotus sp. MAS Camponotus sp. MAS Carebara aff. brevipilosa MAS Cargolia sp. BOLD:AAI Carteris oculatalis Cephalotes sp. MAS Ceratopogonidae sp. BOLD:AAN Ceromya sp. Wood03DHJ Cissia similis Cissia sp. Janzen Condica albigera Coscaga picatalis Crematogaster sp. MAS Crematogaster sp. MAS Detritivora barnesi Dolichoderus bispinosus Dolichogenidea sp. Janzen Eciton burchellii

102 Elachistidae gen. elachjanzen01 sp. Janzen Entomobryidae sp. DPCOL Eudorylas sp. T Eulepidotis addens Eulophinae gen. eulohansson01 sp. BioLep Hermeuptychia sp. hermeseco Hymenoptera sp. BOLD:AAB Hymenoptera sp. BOLD:AAF Ichneumonidae gen. campojanzen01 sp. Janzen Ichneumonidae gen. ichjanzen01 sp. Janzen Kurzia media Lepidoptera sp. BOLD:AAA Lepidoptera sp. BOLD:AAA Lepidoptera sp. BOLD:AAA Lepidoptera sp. BOLD:AAA Lepidoptera sp. BOLD:AAA Lepidoptera sp. BOLD:AAA Lepidoptera sp. BOLD:AAA Lepidoptera sp. BOLD:AAA Lepidoptera sp. BOLD:AAA Lepidoptera sp. BOLD:AAA Lepidoptera sp. BOLD:AAA Lepidoptera sp. BOLD:AAB Lepidoptera sp. BOLD:AAB Lepidoptera sp. BOLD:AAC Lepidoptera sp. BOLD:AAF Lepidoptera sp. BOLD:AAF Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH

103 Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAH Lepidoptera sp. BOLD:AAI Lepidoptera sp. BOLD:AAM Lepidoptera sp. BOLD:AAQ Lepidoptera sp. BOLD:AAU Lepidoptera sp. BOLD:ABA Lepidoptera sp. BOLD:ABA Lepidoptera sp. BOLD:ABA Lepidoptera sp. BOLD:ABA Leptogenys sp. MAS Lycaenidae gen. lycaenjanzen01 sp. Janzen Mantidae gen. mantisjanzen01 sp. Janzen Morys valda Napata leucotelus Neoselenia banasa Noctuidae gen. noctbiolep01 sp. BioLep Noctuidae gen. noctbiolep01 sp. BioLep Noctuidae gen. noctbiolep01 sp. BioLep Noctuidae gen. noctbiolep01 sp. BioLep

104 Odontomachus bauri Oruza sp. Poole Parapanteles sp. Whitfield Parocystola holodryas Parphorus decora Pheidole sp. MAS Pheidole susannae Pseudomyrmex flavicornis Pseudomyrmex nigrocinctus Pseudomyrmex sp. BOLD:ABY Pseudomyrmex sp. BOLD:ABZ Pseudomyrmex spinicola Remella vopiscus Renia sp. BioLep Scopula sp. BioLep Scopula sp. BioLep Spathidexia sp. Wood03 Spilomelinae gen. spilobiolep01 sp. BioLep Stantonia sp. Janzen Staphylus azteca Tachinidae gen. tachjanzen01 sp. Janzen Tachinidae gen. tachjanzen01 sp. Janzen

105 Appendix IV: 16S v3v4 Taxonomy Sequence Matrices N.B.: OTU matrix available as a supplementary .csv file upon request. Order 1A 1B 1C 2A 2B 2C 3A 3B 3C Bacteroidales Bifidobacteriales Burkholderiales Campylobacterales Clostridiales Enterobacteriales Entomoplasmatales Flavobacteriales incertae sedis Lactobacillales Orbales Pseudomonadales Rhizobiales Rhodospirillales Rickettsiales Sphingobacteriales Sphingomonadales Xanthomonadales Family 1A 1B 1C 2A 2B 2C 3A 3B 3C Acetobacteraceae Alcaligenaceae Bacteroidaceae Bartonellaceae Bifidobacteriaceae Brucellaceae Burkholderiaceae Campylobacteraceae Enterobacteriaceae Enterococcaceae Entomoplasmataceae Flavobacteriaceae incertae sedis Lachnospiraceae Moraxellaceae

106 Orbaceae Oxalobacteraceae Porphyromonadaceae Prevotellaceae Pseudomonadaceae Rickettsiaceae Rikenellaceae Ruminococcaceae Sphingobacteriaceae Sphingomonadaceae Spiroplasmataceae Streptococcaceae Xanthomonadaceae Genus 1A 1B 1C 2A 2B 2C 3A 3B 3C Achromobacter Acidobacteria_Gp4_genus Acinetobacter Alistipes Alkanindiges Anaerofilum Anaerotruncus Anaerovorax Arsenophonus Asaia Bacteroides Barnesiella Bartonella Bifidobacterium Biostraticola Bordetella Brenneria Brucella Burkholderia Buttiauxella Campylobacter Cedecea Chryseobacterium Clostridium IV Clostridium XlVa Clostridium XlVb

107 Coprococcus Dorea Dysgonomonas Enterobacter Enterococcus Entomoplasma Epilithonimonas Erwinia Escherichia/Shigella Ethanoligenens Gluconobacter Klebsiella Kluyvera Lachnospiracea_incertae_sedis Lactococcus Leclercia Mesoplasma Morganella Myroides Neokomagataea Ochrobactrum Orbus Orientia Ornithobacterium Oxalicibacterium Paenochrobactrum Paludibacter Pantoea Papillibacter Parabacteroides Paraprevotella Pedobacter Petrimonas Pragia Prevotella Proteus Providencia Pseudochrobactrum Pseudoflavonifractor Pseudomonas Raoultella

108 Rickettsia Ruminococcus Saccharibacter Salmonella Serratia Shimwellia Sphingobacterium Sphingomonas Spiroplasma Stenotrophomonas Streptococcus Streptophyta Subdivision5_genera_incertae_sedis Tannerella TM7_genera_incertae_sedis Vagococcus Xylella Zymomonas

109 Appendix V: Target Regions of Primer Sets 98

Chapter 8. Biogeographic Processes. Upon completion of this chapter the student will be able to:

Chapter 8. Biogeographic Processes. Upon completion of this chapter the student will be able to: Chapter 8 Biogeographic Processes Chapter Objectives Upon completion of this chapter the student will be able to: 1. Define the terms ecosystem, habitat, ecological niche, and community. 2. Outline how

More information

Earth s Major Terrerstrial Biomes. *Wetlands (found all over Earth)

Earth s Major Terrerstrial Biomes. *Wetlands (found all over Earth) Biomes Biome: the major types of terrestrial ecosystems determined primarily by climate 2 main factors: Depends on ; proximity to ocean; and air and ocean circulation patterns Similar traits of plants

More information

Amy Driskell. Laboratories of Analytical Biology National Museum of Natural History Smithsonian Institution, Wash. DC

Amy Driskell. Laboratories of Analytical Biology National Museum of Natural History Smithsonian Institution, Wash. DC DNA Barcoding Amy Driskell Laboratories of Analytical Biology National Museum of Natural History Smithsonian Institution, Wash. DC 1 Outline 1. Barcoding in general 2. Uses & Examples 3. Barcoding Bocas

More information

Global Patterns Gaston, K.J Nature 405. Benefit Diversity. Threats to Biodiversity

Global Patterns Gaston, K.J Nature 405. Benefit Diversity. Threats to Biodiversity Biodiversity Definitions the variability among living organisms from all sources, including, 'inter alia', terrestrial, marine, and other aquatic ecosystems, and the ecological complexes of which they

More information

Ontario Science Curriculum Grade 9 Academic

Ontario Science Curriculum Grade 9 Academic Grade 9 Academic Use this title as a reference tool. SCIENCE Reproduction describe cell division, including mitosis, as part of the cell cycle, including the roles of the nucleus, cell membrane, and organelles

More information

Georgia Performance Standards for Urban Watch Restoration Field Trips

Georgia Performance Standards for Urban Watch Restoration Field Trips Georgia Performance Standards for Field Trips 6 th grade S6E3. Students will recognize the significant role of water in earth processes. a. Explain that a large portion of the Earth s surface is water,

More information

Biosphere Biome Ecosystem Community Population Organism

Biosphere Biome Ecosystem Community Population Organism Ecology ecology - The study of living things and how they relate to their environment Levels of Organization in Ecology organism lowest level one living thing population collection of organisms of the

More information

Taxonomy and Systematics: a broader classification system that also shows evolutionary relationships

Taxonomy and Systematics: a broader classification system that also shows evolutionary relationships Taxonomy: a system for naming living creatures Carrolus Linnaeus (1707-1778) The binomial system: Genus and species e.g., Macrocystis pyrifera (Giant kelp); Medialuna californiensis (halfmoon) Taxonomy

More information

Data Dictionary for Network of Conservation Areas Transcription Reports from the Colorado Natural Heritage Program

Data Dictionary for Network of Conservation Areas Transcription Reports from the Colorado Natural Heritage Program Data Dictionary for Network of Conservation Areas Transcription Reports from the Colorado Natural Heritage Program This Data Dictionary defines terms used in Network of Conservation Areas (NCA) Reports

More information

Evaluating Wildlife Habitats

Evaluating Wildlife Habitats Lesson C5 4 Evaluating Wildlife Habitats Unit C. Animal Wildlife Management Problem Area 5. Game Animals Management Lesson 4. Evaluating Wildlife Habitats New Mexico Content Standard: Pathway Strand: Natural

More information

SGCEP SCIE 1121 Environmental Science Spring 2012 Section Steve Thompson:

SGCEP SCIE 1121 Environmental Science Spring 2012 Section Steve Thompson: SGCEP SCIE 1121 Environmental Science Spring 2012 Section 20531 Steve Thompson: steventhompson@sgc.edu http://www.bioinfo4u.net/ 1 Ecosystems, energy flows, and biomes Today s going to be a bit different.

More information

Chapter 7 Part III: Biomes

Chapter 7 Part III: Biomes Chapter 7 Part III: Biomes Biomes Biome: the major types of terrestrial ecosystems determined primarily by climate 2 main factors: Temperature and precipitation Depends on latitude or altitude; proximity

More information

Ecosystems Chapter 4. What is an Ecosystem? Section 4-1

Ecosystems Chapter 4. What is an Ecosystem? Section 4-1 Ecosystems Chapter 4 What is an Ecosystem? Section 4-1 Ecosystems Key Idea: An ecosystem includes a community of organisms and their physical environment. A community is a group of various species that

More information

Biodiversity Blueprint Overview

Biodiversity Blueprint Overview Biodiversity Blueprint Overview Climate Variability Climate projections for the Glenelg Hopkins Regions suggest that the weather will be hotter and drier in the coming years which will impact on land use,

More information

Resolution XIII.23. Wetlands in the Arctic and sub-arctic

Resolution XIII.23. Wetlands in the Arctic and sub-arctic 13th Meeting of the Conference of the Contracting Parties to the Ramsar Convention on Wetlands Wetlands for a Sustainable Urban Future Dubai, United Arab Emirates, 21-29 October 2018 Resolution XIII.23

More information

Interrelationships. 1. Temperature Wind Fire Rainfall Soil Type Floods Sunlight Altitude Earthquake

Interrelationships. 1. Temperature Wind Fire Rainfall Soil Type Floods Sunlight Altitude Earthquake Interrelationships Abiotic Factors A. A Partial List 1. Temperature Wind Fire Rainfall Soil Type Floods Sunlight Altitude Earthquake B. Aquatic Adaptations 1. Pumping salt out a. Salt water fish 2. Pumping

More information

Phylogenetic diversity and conservation

Phylogenetic diversity and conservation Phylogenetic diversity and conservation Dan Faith The Australian Museum Applied ecology and human dimensions in biological conservation Biota Program/ FAPESP Nov. 9-10, 2009 BioGENESIS Providing an evolutionary

More information

Ecosystems. 1. Population Interactions 2. Energy Flow 3. Material Cycle

Ecosystems. 1. Population Interactions 2. Energy Flow 3. Material Cycle Ecosystems 1. Population Interactions 2. Energy Flow 3. Material Cycle The deep sea was once thought to have few forms of life because of the darkness (no photosynthesis) and tremendous pressures. But

More information

DNA Barcoding: A New Tool for Identifying Biological Specimens and Managing Species Diversity

DNA Barcoding: A New Tool for Identifying Biological Specimens and Managing Species Diversity DNA Barcoding: A New Tool for Identifying Biological Specimens and Managing Species Diversity DNA barcoding has inspired a global initiative dedicated to: Creating a library of new knowledge about species

More information

About me (why am I giving this talk) Dr. Bruce A. Snyder

About me (why am I giving this talk) Dr. Bruce A. Snyder Ecology About me (why am I giving this talk) Dr. Bruce A. Snyder basnyder@ksu.edu PhD: Ecology (University of Georgia) MS: Environmental Science & Policy BS: Biology; Environmental Science (University

More information

Lecture 24 Plant Ecology

Lecture 24 Plant Ecology Lecture 24 Plant Ecology Understanding the spatial pattern of plant diversity Ecology: interaction of organisms with their physical environment and with one another 1 Such interactions occur on multiple

More information

Spheres of Life. Ecology. Chapter 52. Impact of Ecology as a Science. Ecology. Biotic Factors Competitors Predators / Parasites Food sources

Spheres of Life. Ecology. Chapter 52. Impact of Ecology as a Science. Ecology. Biotic Factors Competitors Predators / Parasites Food sources "Look again at that dot... That's here. That's home. That's us. On it everyone you love, everyone you know, everyone you ever heard of, every human being who ever was, lived out their lives. Ecology Chapter

More information

Dynamic and Succession of Ecosystems

Dynamic and Succession of Ecosystems Dynamic and Succession of Ecosystems Kristin Heinz, Anja Nitzsche 10.05.06 Basics of Ecosystem Analysis Structure Ecosystem dynamics Basics Rhythms Fundamental model Ecosystem succession Basics Energy

More information

3rd Six Weeks Pre-Test (Review)

3rd Six Weeks Pre-Test (Review) Name 3rd Six Weeks Pre-Test (Review) Period 1 How can a model of the solar system be used in planning a trip from Earth to another planet? To estimate distance, travel time and fuel cost. B To anticipate

More information

Students will work in small groups to collect detailed data about a variety of living things in the study area.

Students will work in small groups to collect detailed data about a variety of living things in the study area. TEACHER BOOKLET Sampling along a transect Name BIOLOGY Students will work in small groups to collect detailed data about a variety of living things in the study area. Students will need: 10 metre long

More information

10/6/ th Grade Ecology and the Environment. Chapter 2: Ecosystems and Biomes

10/6/ th Grade Ecology and the Environment. Chapter 2: Ecosystems and Biomes 7 th Grade Ecology and the Environment Chapter 2: Ecosystems and Biomes Lesson 1 (Energy Flow in Ecosystems) Each organism in an ecosystem fills an energy role. Producer an organism that can make its own

More information

Ecology Test Biology Honors

Ecology Test Biology Honors Do Not Write On Test Ecology Test Biology Honors Multiple Choice Identify the choice that best completes the statement or answers the question. 1. The study of the interaction of living organisms with

More information

SIF_7.1_v2. Indicator. Measurement. What should the measurement tell us?

SIF_7.1_v2. Indicator. Measurement. What should the measurement tell us? Indicator 7 Area of natural and semi-natural habitat Measurement 7.1 Area of natural and semi-natural habitat What should the measurement tell us? Natural habitats are considered the land and water areas

More information

Computational Ecology Introduction to Ecological Science. Sonny Bleicher Ph.D.

Computational Ecology Introduction to Ecological Science. Sonny Bleicher Ph.D. Computational Ecology Introduction to Ecological Science Sonny Bleicher Ph.D. Ecos Logos Defining Ecology Interactions: Organisms: Plants Animals: Bacteria Fungi Invertebrates Vertebrates The physical

More information

Good Morning! When the bell rings we will be filling out AP Paper work.

Good Morning! When the bell rings we will be filling out AP Paper work. Good Morning! Turn in HW into bin or email to smithm9@fultonschools.org If you do not want to tear the lab out of your notebook take a picture and email it. When the bell rings we will be filling out AP

More information

soils E) the Coriolis effect causes the moisture to be carried sideways towards the earth's oceans, leaving behind dry land masses

soils E) the Coriolis effect causes the moisture to be carried sideways towards the earth's oceans, leaving behind dry land masses MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 1) A biome is characterized primarily by A) flora and fauna. B) soil structure and flora. C) temperature

More information

Microbiome: 16S rrna Sequencing 3/30/2018

Microbiome: 16S rrna Sequencing 3/30/2018 Microbiome: 16S rrna Sequencing 3/30/2018 Skills from Previous Lectures Central Dogma of Biology Lecture 3: Genetics and Genomics Lecture 4: Microarrays Lecture 12: ChIP-Seq Phylogenetics Lecture 13: Phylogenetics

More information

Bright blue marble floating in space. Biomes & Ecology

Bright blue marble floating in space. Biomes & Ecology Bright blue marble floating in space Biomes & Ecology Chapter 50 Spheres of life Molecules Cells (Tissues Organ Organ systems) Organisms Populations Community all the organisms of all the species that

More information

Most people used to live like this

Most people used to live like this Urbanization Most people used to live like this Increasingly people live like this. For the first time in history, there are now more urban residents than rural residents. Land Cover & Land Use Land cover

More information

Our Living Planet. Chapter 15

Our Living Planet. Chapter 15 Our Living Planet Chapter 15 Learning Goals I can describe the Earth s climate and how we are affected by the sun. I can describe what causes different climate zones. I can describe what makes up an organisms

More information

DNA Barcoding and taxonomy of Glossina

DNA Barcoding and taxonomy of Glossina DNA Barcoding and taxonomy of Glossina Dan Masiga Molecular Biology and Biotechnology Department, icipe & Johnson Ouma Trypanosomiasis Research Centre, KARI The taxonomic problem Following ~250 years of

More information

D. Adaptive Radiation

D. Adaptive Radiation D. Adaptive Radiation One species new species: A new species: B new species: C new species: D Typically occurs when populations of a single species... invade a variety of new habitats, evolve under different

More information

Ecology 312 SI STEVEN F. Last Session: Aquatic Biomes, Review This Session: Plate Tectonics, Lecture Quiz 2

Ecology 312 SI STEVEN F. Last Session: Aquatic Biomes, Review This Session: Plate Tectonics, Lecture Quiz 2 Ecology 312 SI STEVEN F. Last Session: Aquatic Biomes, Review This Session: Plate Tectonics, Lecture Quiz 2 Questions? Warm up: KWL KNOW: On a piece of paper, write down things that you know well enough

More information

water cycle evaporation condensation the process where water vapor the cycle in which Earth's water moves through the environment

water cycle evaporation condensation the process where water vapor the cycle in which Earth's water moves through the environment cycle a series of events that happen over and over water cycle evaporation the cycle in which Earth's water moves through the environment process when the heat of the sun changes water on Earth s surface

More information

Ecology - the study of how living things interact with each other and their environment

Ecology - the study of how living things interact with each other and their environment Ecology Ecology - the study of how living things interact with each other and their environment Biotic Factors - the living parts of a habitat Abiotic Factors - the non-living parts of a habitat examples:

More information

Catastrophic Events Impact on Ecosystems

Catastrophic Events Impact on Ecosystems Catastrophic Events Impact on Ecosystems Hurricanes Hurricanes An intense, rotating oceanic weather system with sustained winds of at least 74 mph and a welldefined eye Conditions for formation: Warm water

More information

Survey Protocols for Monitoring Status and Trends of Pollinators

Survey Protocols for Monitoring Status and Trends of Pollinators Survey Protocols for Monitoring Status and Trends of Pollinators This annex presents the bee monitoring protocols to be applied in the context of monitoring status and trends of pollinators in STEP sites.

More information

Section 8. North American Biomes. What Do You See? Think About It. Investigate. Learning Outcomes

Section 8. North American Biomes. What Do You See? Think About It. Investigate. Learning Outcomes Section 8 North American Biomes What Do You See? Learning Outcomes In this section, you will Define the major biomes of North America and identify your community s biome. Understand that organisms on land

More information

The Diversity of Living Things

The Diversity of Living Things The Diversity of Living Things Biodiversity When scientists speak of the variety of organisms (and their genes) in an ecosystem, they refer to it as biodiversity. A biologically diverse ecosystem, such

More information

CHAPTER 6 & 7 VOCABULARY

CHAPTER 6 & 7 VOCABULARY CHAPTER 6 & 7 VOCABULARY 1. Biome 2. Climate 3. Latitude 4. Altitude 5. Emergent layer 6. Epiphyte 7. Understory 8. Permafrost 9. Wetland 10.Plankton 11.Nekton 12.Benthos 13.Littoral zone 14.Benthic zone

More information

Vanishing Species 5.1. Before You Read. Read to Learn. Biological Diversity. Section. What do biodiversity studies tell us?

Vanishing Species 5.1. Before You Read. Read to Learn. Biological Diversity. Section. What do biodiversity studies tell us? Vanishing Species Before You Read Dinosaurs are probably the most familiar organisms that are extinct, or no longer exist. Many plants and animals that are alive today are in danger of dying out. Think

More information

BIO B.4 Ecology You should be able to: Keystone Vocabulary:

BIO B.4 Ecology You should be able to: Keystone Vocabulary: Name Period BIO B.4 Ecology You should be able to: 1. Describe ecological levels of organization in the biosphere 2. Describe interactions and relationships in an ecosystem.. Keystone Vocabulary: Ecology:

More information

Quantum Dots: A New Technique to Assess Mycorrhizal Contributions to Plant Nitrogen Across a Fire-Altered Landscape

Quantum Dots: A New Technique to Assess Mycorrhizal Contributions to Plant Nitrogen Across a Fire-Altered Landscape 2006-2011 Mission Kearney Foundation of Soil Science: Understanding and Managing Soil-Ecosystem Functions Across Spatial and Temporal Scales Progress Report: 2006007, 1/1/2007-12/31/2007 Quantum Dots:

More information

Go to the following website:

Go to the following website: Name: Date: Go to the following website: http://www.cotf.edu/ete/modules/msese/earthsysflr/biomes.html Answer the following questions from the first page called Biomes on this website. 1. What does climate

More information

Chapter 52 An Introduction to Ecology and the Biosphere

Chapter 52 An Introduction to Ecology and the Biosphere Chapter 52 An Introduction to Ecology and the Biosphere Ecology The study of the interactions between organisms and their environment. Ecology Integrates all areas of biological research and informs environmental

More information

What is insect forecasting, and why do it

What is insect forecasting, and why do it Insect Forecasting Programs: Objectives, and How to Properly Interpret the Data John Gavloski, Extension Entomologist, Manitoba Agriculture, Food and Rural Initiatives Carman, MB R0G 0J0 Email: jgavloski@gov.mb.ca

More information

United States Department of the Interior NATIONAL PARK SERVICE Northeast Region

United States Department of the Interior NATIONAL PARK SERVICE Northeast Region United States Department of the Interior NATIONAL PARK SERVICE Northeast Region June 17, 2017 REQUEST FOR STATEMENTS OF INTEREST and QUALIFICATIONS Project Title: ASSESSMENT OF NATURAL RESOURCE CONDITION

More information

Biomes Section 2. Chapter 6: Biomes Section 2: Forest Biomes DAY ONE

Biomes Section 2. Chapter 6: Biomes Section 2: Forest Biomes DAY ONE Chapter 6: Biomes Section 2: Forest Biomes DAY ONE Of all the biomes in the world, forest biomes are the most widespread and the most diverse. The large trees of forests need a lot of water, so forests

More information

3.1 Distribution of Organisms in the Biosphere Date:

3.1 Distribution of Organisms in the Biosphere Date: 3.1 Distribution of Organisms in the Biosphere Date: Warm up: Study Notes/Questions The distribution of living things is limited by in different areas of Earth. The distribution of life in the biosphere

More information

Ecosystems. Component 3: Contemporary Themes in Geography 32% of the A Level
