Other resources. Greengenes (bacterial) Silva (bacteria, archaeal and eukarya)
- Kenneth Norman
- 6 years ago
Transcription
1 General QIIME resources
- Blog (news, updates):
- Support/forum:
- Citing QIIME: Caporaso, J.G. et al., QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7(5), (2010)
- Citing tools used in QIIME:
- Other useful QIIME papers: Navas-Molina et al., Chapter Nineteen: Advancing Our Understanding of the Human Microbiome Using QIIME. Microbial Metagenomics, Metatranscriptomics, and Metaproteomics. Methods in Enzymology, Volume 531, Pages (2013)
- Data files and other resources:
2 Other resources
- UniFrac:
- Usearch
- Phyloseq R package:
- Databases: Greengenes (bacterial), Silva (bacterial, archaeal and eukaryotic), RDP (bacterial, archaeal, fungal)
5 OTU picking (clustering)
- QIIME has three methods for OTU picking: de novo, closed-reference and open-reference.
- Sequences are grouped together based on sequence identity to each other (de novo) or on alignment to a reference sequence (closed-reference: reads are discarded if they don't match a reference; open-reference: reads that don't hit a reference are clustered de novo).
- QIIME recommends open-reference OTU picking for large datasets.
- For de novo OTU picking, % identity is usually set at 97% or higher; each OTU is then assumed to represent a species.
- Full-length 16S rRNA gene sequences: the 97% cutoff for the 16S rRNA gene was defined by Stackebrandt & Goebel, then reviewed and refined by others, the latest being Kim et al., IJSEM 64 ( % 16S rRNA gene sequence similarity can be used as the threshold for differentiating two species). This is based on full-length 16S rRNA sequences and cannot be directly extrapolated to microbial community NGS studies.
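The three OTU picking modes can be illustrated with a toy greedy clusterer. This is only a sketch under strong assumptions: `identity` and `pick_otus` are hypothetical helper names, sequences are compared position by position without alignment, and each read joins the first seed it matches. It is not how QIIME or its underlying tools implement clustering.

```python
def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy identity needs equal-length sequences")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def pick_otus(reads, references=(), threshold=0.97, mode="de_novo"):
    """Greedy OTU picking; returns (otus, discarded).

    otus maps a seed sequence (a reference or the first read of a new
    cluster) to the list of reads assigned to it.
    """
    if mode not in ("de_novo", "closed", "open"):
        raise ValueError(f"unknown mode: {mode}")
    otus = {}
    discarded = []
    # References are only used in closed- and open-reference modes.
    ref_seeds = list(references) if mode in ("closed", "open") else []
    new_seeds = []  # seeds created de novo from the reads themselves
    for read in reads:
        candidates = ref_seeds + new_seeds
        hit = next((s for s in candidates if identity(read, s) >= threshold), None)
        if hit is not None:
            otus.setdefault(hit, []).append(read)
        elif mode == "closed":
            discarded.append(read)      # no reference hit: read is dropped
        else:
            new_seeds.append(read)      # de novo / open: read seeds a new OTU
            otus[read] = [read]
    return otus, discarded


reference = ["AAAAAAAAAA"]
reads = ["AAAAAAAAAT", "CCCCCCCCCC", "CCCCCCCCCG"]
closed_otus, closed_discarded = pick_otus(reads, reference, 0.9, "closed")
open_otus, open_discarded = pick_otus(reads, reference, 0.9, "open")
de_novo_otus, _ = pick_otus(reads, threshold=0.9, mode="de_novo")
```

With these made-up 10-base reads and a 90% threshold, closed-reference picking keeps one read and discards two, while open-reference picking keeps all three by forming one extra de novo cluster.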
6 = recommended New method: Uparse: Edgar, R.C. (2013) UPARSE: Highly accurate OTU sequences from microbial amplicon reads, Nature Methods [Pubmed: , dx.doi.org/ /nmeth.2604].
7 Pros and cons of OTU picking approaches Also see:
9 Important ecological concepts
How is biodiversity defined and measured? Components of biodiversity: richness, relative abundance, evenness.
- Species richness: number of different species in a habitat/sample
- Species relative abundance: number of individuals of each species relative to the total number of individuals of all species in a sample (number of reads per OTU in a sample relative to the total number of reads in that sample)
- Species evenness: how close in numbers the species in an environment are (their distribution)
Simple example:
- Even: richness = 2 species; relative abundances 5/10 = 0.5 (50%) and 5/10 = 0.5 (50%)
- Uneven: richness = 2 species; relative abundances 2/7 = 0.29 (29%) and 5/7 = 0.71 (71%)
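The simple example can be reproduced in a few lines of Python. This is a minimal sketch: the function names are hypothetical, and evenness is quantified here with Pielou's index (Shannon H divided by ln of richness), which is one common choice and an assumption of this example rather than something the slide specifies.

```python
import math


def richness(counts):
    """Number of species with at least one individual."""
    return sum(1 for c in counts if c > 0)


def relative_abundance(counts):
    """Share of the total held by each species."""
    total = sum(counts)
    return [c / total for c in counts]


def pielou_evenness(counts):
    """Shannon H / ln(richness): 1.0 means perfectly even."""
    s = richness(counts)
    if s < 2:
        raise ValueError("evenness needs at least two species")
    h = -sum(p * math.log(p) for p in relative_abundance(counts) if p > 0)
    return h / math.log(s)


even_sample = [5, 5]      # the "even" community from the example
uneven_sample = [2, 5]    # the "uneven" community from the example

even_props = relative_abundance(even_sample)
uneven_props = [round(p, 2) for p in relative_abundance(uneven_sample)]
```

Both communities have a richness of 2, but only the first has an evenness of 1.0.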
10 Alpha diversity: diversity within a single habitat unit/sample (alone)
Beta diversity: diversity between units/samples
Gamma diversity: total diversity in a landscape
Simple example (species richness, samples X, Y and Z):
- Alpha values shown: 9 species, 4 species and 5 species
- Beta X vs Y: 3 shared species, 6 unique species
- Beta Y vs Z: 1 shared species, 7 unique species
- Beta X vs Z: 2 shared species, 9 unique species
- Gamma: total of 12 unique species
Modified from:
Problem: this doesn't take the abundance of each species or the relatedness of the species into account.
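Treating each sample as a set of species makes the alpha, beta and gamma definitions concrete. The sketch below uses made-up species lists (they are not the ones from the slide figure), and the helper names are hypothetical.

```python
def alpha(sample):
    """Richness within one sample."""
    return len(sample)


def shared(a, b):
    """Species found in both samples."""
    return a & b


def unshared(a, b):
    """Species found in only one of the two samples."""
    return a ^ b


def gamma(samples):
    """Total number of distinct species across all samples."""
    return len(set().union(*samples))


# Hypothetical communities: letters stand for species.
X = set("ABCDEF")
Y = set("DEFG")
Z = set("FGHI")

alphas = {name: alpha(s) for name, s in (("X", X), ("Y", Y), ("Z", Z))}
beta_xy = (len(shared(X, Y)), len(unshared(X, Y)))
total = gamma([X, Y, Z])
```

As in the slide, these counts ignore both abundance and relatedness: two samples that share a species count the same whether that species is dominant or rare.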
11 [Figure: samples X, Y and Z shown three ways: by species richness, with size adjusted according to abundance, and by phylogenetic relationship]
Metrics used to describe diversity measure different aspects of the community.
12 Note this list is outdated (QIIME1.7)
13 Alpha metrics: estimate richness
- Observed species: count of unique OTUs in a sample
- Chao1: estimates how likely it is that there are more undiscovered species. Sobs = number of species in the sample, F1 = number of singletons (species that appear once in the sample), F2 = number of doubletons (species that appear twice in the sample).
- Central concept: if rare species (singletons) are still being discovered when sampling a community, then there are probably more rare species yet to be found. If all species have been found at least twice (doubletons), it is less likely that new species remain to be discovered.
- Both measure richness (number of species). Richness does not take the abundances of the types into account; it is not the same thing as diversity.
- May be useful for judging completeness of sampling, i.e. is the sample size/sequencing depth enough to capture all species? See rarefaction.
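A short sketch of the two richness estimates, using the quantities defined above. It assumes the classic Chao1 form, Sobs + F1^2 / (2 * F2), and falls back to the bias-corrected form when there are no doubletons; which variant the original slide showed is not recoverable, so treat that choice as an assumption. The function names and the counts are hypothetical.

```python
def observed_species(counts):
    """Number of OTUs seen at least once in the sample."""
    return sum(1 for c in counts if c > 0)


def chao1(counts):
    """Chao1 richness estimate from per-OTU read counts."""
    s_obs = observed_species(counts)
    f1 = sum(1 for c in counts if c == 1)   # singletons
    f2 = sum(1 for c in counts if c == 2)   # doubletons
    if f2 == 0:
        # Bias-corrected form, Sobs + F1(F1-1) / (2(F2+1)), avoids dividing by zero.
        return s_obs + f1 * (f1 - 1) / 2
    return s_obs + f1 ** 2 / (2 * f2)


many_rare = [10, 5, 1, 1, 1, 2, 2]   # 3 singletons, 2 doubletons
no_rare = [4, 3, 2, 2]               # no singletons

estimate_many_rare = chao1(many_rare)
estimate_no_rare = chao1(no_rare)
```

The sample with three singletons gets an estimate above its observed richness of 7, while the sample without singletons is estimated to be fully sampled.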
14 Alpha metrics: estimate diversity
Shannon diversity index (also called the Shannon-Weaver, Shannon-Wiener or Shannon index):
- Complicated computation, based on information theory (other metrics that use this: Brillouin indices)
- The Shannon diversity index (H) characterizes species diversity and accounts for the abundance and evenness of the species.
- The Shannon equitability index (EH) is a measure of evenness. If S is the number of observed species, then EH = H / ln(S).
Simpson diversity index (1 - D):
- Simpson's diversity index = 1 - D, a value between 0 and 1: 0 = no diversity, 1 = infinite diversity
- Simpson index: $D = \frac{\sum n(n-1)}{N(N-1)}$, where n = total number of individuals of each species and N = total number of individuals of all species
- Example (table of species A-D with columns n, n-1 and n(n-1); totals N = 27 and $\sum n(n-1) = 254$): $D = \frac{254}{27(27-1)} = \frac{254}{702} = 0.36$, so 1 - D = 0.64
- Simpson's reciprocal index = 1/D
- D is the probability of two individuals being conspecifics if drawn randomly from an infinitely large community
- Simple computation: measures species dominance (weighted towards the abundance of the most common species) (other metrics: McIntosh and Berger-Parker)
- Total species richness is downweighted relative to evenness
Both indices estimate diversity (richness, abundance and evenness). Simpson diversity is less sensitive to richness and more sensitive to evenness than Shannon diversity.
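The indices on this slide can be computed directly from per-species counts. In the sketch below, Shannon H is taken as minus the sum of p ln(p) over species proportions, and Simpson D uses the n(n-1) / N(N-1) form given in the worked example. The per-species counts did not survive in the transcription, so the counts used here (15, 6, 4, 2) are hypothetical values chosen only because they reproduce the example's totals (N = 27, sum of n(n-1) = 254).

```python
import math


def shannon(counts):
    """Shannon diversity H = -sum(p * ln p) over species proportions p."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def shannon_equitability(counts):
    """EH = H / ln(S), where S is the number of observed species."""
    s = sum(1 for c in counts if c > 0)
    return shannon(counts) / math.log(s)


def simpson_d(counts):
    """Simpson index D = sum n(n-1) / (N(N-1))."""
    n_total = sum(counts)
    return sum(c * (c - 1) for c in counts) / (n_total * (n_total - 1))


counts = [15, 6, 4, 2]                    # hypothetical species A-D
d = simpson_d(counts)
simpson_diversity = 1 - d                 # Simpson's diversity index
simpson_reciprocal = 1 / d                # Simpson's reciprocal index
h = shannon(counts)
eh = shannon_equitability(counts)
```

Rounded to two decimals, D is 0.36 and 1 - D is 0.64, matching the example.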
15 Alpha metrics: estimate phylogenetic diversity
- PD/PD_whole_tree: Faith's Phylogenetic Diversity (PD), the minimum total branch length of the phylogenetic tree that incorporates all OTUs in a sample
- Not weighted for abundance
- PD weighted for abundance: see
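Faith's PD can be sketched on a toy tree stored as a child-to-parent mapping. Everything here is hypothetical (the tree, the OTU names and the `faith_pd` helper). The sketch also assumes that every branch on the path from an observed OTU up to the root is counted, which is one common convention and may differ from other implementations.

```python
# Toy tree: node -> (parent, branch length to parent). "root" has no entry.
TREE = {
    "otu1": ("n1", 1.0),
    "otu2": ("n1", 2.0),
    "n1": ("root", 0.5),
    "otu3": ("root", 3.0),
}


def faith_pd(tree, observed_tips):
    """Total length of all branches connecting the observed tips to the root."""
    used = set()
    for tip in observed_tips:
        node = tip
        # Walk up until the root, or until a branch that is already counted.
        while node in tree and node not in used:
            used.add(node)
            node = tree[node][0]
    return sum(tree[node][1] for node in used)


pd_one = faith_pd(TREE, {"otu1"})
pd_sisters = faith_pd(TREE, {"otu1", "otu2"})
pd_all = faith_pd(TREE, {"otu1", "otu2", "otu3"})
```

Presence is all that matters in this calculation: an OTU with one read contributes the same branch length as an OTU with a thousand reads.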
16 Rarefaction
- Collector's curves (figure): 50 individuals, 2 species; 250 individuals, 4 species; 500 individuals, 8 species (photo: Felix Borner / AP)
- NGS: individuals = reads
- Evaluate sample size: is sequencing depth enough?
- Compare the richness and diversity observed in different samples
- Note: rarefaction is not the same as rarefying
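A rarefaction curve can be sketched by repeatedly subsampling reads without replacement at increasing depths and averaging the number of OTUs observed. The `rarefaction_curve` helper, the OTU counts and the chosen depths are all hypothetical, and the random generator is seeded only to make the sketch repeatable.

```python
import random


def rarefaction_curve(counts, depths, iterations=10, seed=0):
    """Mean number of observed OTUs at each subsampling depth.

    counts maps OTU id -> number of reads in one sample.
    Depths larger than the sample are skipped.
    """
    rng = random.Random(seed)
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    curve = {}
    for depth in depths:
        if depth > len(reads):
            continue
        observed = 0
        for _ in range(iterations):
            # Sample reads without replacement and count distinct OTUs.
            observed += len(set(rng.sample(reads, depth)))
        curve[depth] = observed / iterations
    return curve


sample = {"a": 50, "b": 30, "c": 15, "d": 5}   # 100 reads, 4 OTUs
curve = rarefaction_curve(sample, depths=[1, 10, 50, 100, 200])
```

If the curve has flattened by the deepest depth, additional sequencing is unlikely to reveal many more OTUs in that sample.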
17 Alpha diversity rarefaction: QIIME tutorial and some examples
MC / Cardenas PA, Cooper PJ, Cox MJ, Chico M, Arias C, et al. (2012) Upper Airways Microbiota in Antibiotic-Naïve Wheezing and Healthy Infants from the Tropics of Rural Ecuador. PLoS ONE 7(10): e doi: /journal.p
18 = recommended
19 Non-phylogenetic beta diversity
- Bray-Curtis dissimilarity: based on species abundance or count data. 0 ≤ BC ≤ 1: 0 = identical (the two sites have all the same species at the same abundances), 1 = the two sites do not share any species. NB: not a true distance metric (it does not satisfy the triangle inequality).
- Jaccard index: dissimilarity measure for presence/absence data (species present or absent)
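Both measures are short enough to write out. In this sketch Bray-Curtis is computed as the sum of absolute count differences divided by the sum of all counts, and the Jaccard dissimilarity as 1 minus (shared species / species present in either site). The function names and the count vectors are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two count vectors (same OTU order)."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator


def jaccard_dissimilarity(u, v):
    """1 - |shared| / |union|, using presence/absence only."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    return 1 - len(present_u & present_v) / len(present_u | present_v)


site_1 = [6, 4, 0]
site_2 = [3, 2, 5]

bc = bray_curtis(site_1, site_2)
jd = jaccard_dissimilarity(site_1, site_2)
```

Doubling every count in one site changes the Bray-Curtis value but leaves the Jaccard value untouched, since Jaccard only sees which species are present.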
20 Phylogenetic beta diversity: UniFrac distance UNIFRAC help:
21 UniFrac help:
- Raw unweighted UniFrac: sum of the branch lengths that are unique to one environment or the other, $\sum_i l_i |A_i - B_i|$, where $l_i$ is the branch length between node i and its parent, and $A_i$ and $B_i$ are indicators equal to 0 or 1 as descendants of node i are absent or present in communities A and B respectively.
- In the figure, A = red and B = blue; branches in common are purple, branches unique to A are red and branches unique to B are blue. Presence/absence metric.
- Raw weighted UniFrac: branch lengths are weighted by the relative abundance of sequences
- Normalised weighted UniFrac: takes abundance into account and normalises for branch length
- Rapidly evolving lineages (with long branch lengths) can skew UniFrac
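The unweighted and raw weighted calculations can be sketched on a toy tree. The tree, the two communities and the helper names below are hypothetical. The normalised unweighted value is taken as unique branch length divided by all branch length observed in either community, and the raw weighted value as each branch length times the absolute difference in the proportion of sequences below that branch. The normalised weighted variant is not implemented here.

```python
# Toy tree: node -> (parent, branch length to parent). "root" has no entry.
TREE = {
    "otu1": ("n1", 1.0),
    "otu2": ("n1", 2.0),
    "n1": ("root", 0.5),
    "otu3": ("root", 3.0),
}


def branch_totals(tree, sample):
    """Number of sequences descending from each branch of the tree."""
    totals = {}
    for tip, count in sample.items():
        node = tip
        while node in tree:
            totals[node] = totals.get(node, 0) + count
            node = tree[node][0]
    return totals


def unweighted_unifrac(tree, a, b, normalized=True):
    """Branch length unique to one community (optionally / observed length)."""
    ta, tb = branch_totals(tree, a), branch_totals(tree, b)
    unique = observed = 0.0
    for node, (_parent, length) in tree.items():
        in_a = ta.get(node, 0) > 0
        in_b = tb.get(node, 0) > 0
        if in_a != in_b:
            unique += length
        if in_a or in_b:
            observed += length
    if not normalized:
        return unique
    return unique / observed if observed else 0.0


def raw_weighted_unifrac(tree, a, b):
    """Sum of branch length * |proportion below branch in A - in B|."""
    ta, tb = branch_totals(tree, a), branch_totals(tree, b)
    total_a, total_b = sum(a.values()), sum(b.values())
    return sum(
        length * abs(ta.get(node, 0) / total_a - tb.get(node, 0) / total_b)
        for node, (_parent, length) in tree.items()
    )


community_a = {"otu1": 10, "otu2": 10}
community_b = {"otu2": 5, "otu3": 15}

raw_u = unweighted_unifrac(TREE, community_a, community_b, normalized=False)
norm_u = unweighted_unifrac(TREE, community_a, community_b)
raw_w = raw_weighted_unifrac(TREE, community_a, community_b)
```

Here the long branch leading to otu3 contributes most of the distance in both variants, which illustrates how long branches can dominate the result.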
22 Unifrac distance matrix and clustering UNIFRAC help:
23 UPGMA clustering of UniFrac distance matrix
- Unweighted Pair Group Method with Arithmetic Mean (UPGMA) constructs a rooted tree or dendrogram that reflects the structure present in a pairwise distance matrix (or similarity matrix); in this case, the UniFrac distance matrix
- Simple bottom-up agglomerative hierarchical clustering method: the nearest two samples are merged into a new higher-level cluster, the distance between the new cluster and the remaining samples is calculated, and this is repeated until all samples are clustered
- Great step-by-step explanation at: (Dr Edwards, University of Southampton)
- Assumes a constant rate of evolution (equal rates of mutation)
UniFrac help:
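The merge-and-recompute loop described above can be sketched as follows. The `upgma` function, the sample names and the distance matrix are hypothetical. The distance from a merged cluster to each remaining cluster is the size-weighted mean of the two old distances, which equals the arithmetic mean over all sample pairs.

```python
def upgma(names, matrix):
    """Cluster samples from a symmetric distance matrix.

    Returns (tree, merges): tree is a nested tuple of sample names and
    merges lists (left subtree, right subtree, node height) in merge order.
    """
    clusters = {i: (name, 1) for i, name in enumerate(names)}  # id -> (subtree, size)
    dist = {
        (i, j): matrix[i][j]
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }
    merges = []
    next_id = len(names)
    while len(clusters) > 1:
        # 1. Find the closest pair of clusters (keys are always (smaller id, larger id)).
        (a, b), d_ab = min(dist.items(), key=lambda item: item[1])
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        # 2. Distance from the merged cluster to every remaining cluster.
        for c in clusters:
            d_ac = dist.pop((min(a, c), max(a, c)))
            d_bc = dist.pop((min(b, c), max(b, c)))
            dist[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        del dist[(a, b)]
        # 3. Record the merge; the new node sits at half the merge distance.
        clusters[next_id] = ((tree_a, tree_b), size_a + size_b)
        merges.append((tree_a, tree_b, d_ab / 2))
        next_id += 1
    tree, _size = next(iter(clusters.values()))
    return tree, merges


names = ["A", "B", "C", "D"]
matrix = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 10],
    [10, 10, 10, 0],
]
tree, merges = upgma(names, matrix)
heights = [height for _left, _right, height in merges]
```

For this made-up matrix the samples join in the order A+B, then C, then D, at heights 1, 3 and 5.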
24 Image from
25 Principal Coordinate Analysis (PCoA or PCO)
- Also sometimes called classical MDS (multidimensional scaling)
- Can use any distance matrix (must obey the triangle inequality), in this case the UniFrac distance matrix
- Assumes a linear relation
- Represents the distances between samples graphically in multidimensional space (n-1 dimensions, n = number of samples)
- A new set of reduced variables is derived from the original distances and used to scale the samples
- Samples are then represented on a 2D or 3D plot with these new variables as axes, and the relationships between the samples on the plot should reflect their underlying distances
- Ordinates data on the plot so that axis 1 (PC1) explains the greatest amount of variance, axis 2 (PC2) explains the next greatest amount of variance, etc.
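The core of PCoA can be sketched without any libraries: centre the squared distances, then extract the leading eigenvector. This sketch only recovers the first axis, using power iteration in place of the full eigendecomposition a real implementation would use. It also assumes a symmetric distance matrix whose dominant eigenvalue is positive. The `pcoa_first_axis` name and the example distances are hypothetical.

```python
import math


def pcoa_first_axis(dist, iterations=200):
    """Return (coordinates on axis 1, fraction of variance explained)."""
    n = len(dist)
    # 1. Gower-centre the matrix of -1/2 * squared distances.
    sq = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    b = [
        [sq[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]
    # 2. Power iteration for the dominant eigenvector of the centred matrix.
    v = [float(i + 1) for i in range(n)]
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    # 3. Scale the unit eigenvector by sqrt(eigenvalue) to get coordinates.
    coords = [x * math.sqrt(eigenvalue) for x in v]
    trace = sum(b[i][i] for i in range(n))
    explained = eigenvalue / trace if trace > 0 else 0.0
    return coords, explained


# Three samples lying on a line at positions 0, 1 and 3.
distances = [
    [0, 1, 3],
    [1, 0, 2],
    [3, 2, 0],
]
coords, explained = pcoa_first_axis(distances)
```

Because these three samples lie on a line, a single axis reproduces every pairwise distance and explains all of the variance; real UniFrac matrices need several axes.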
26 Nescent_qiime_tutorial_june2012.pdf
27 Five control samples are all red and the four Fast samples are all blue. This lets you easily visualize clustering by metadata category. The 3d visualization software allows you to rotate the axes to see the data from different perspectives. Metadata categories as they appeared in the columns in your mapping file
28 Jackknife: assess confidence in nodes of the UPGMA tree and in PCoA
- Choose a smaller number of sequences randomly from each sample
- Make a UPGMA tree from this subset of sequences
- Compare it with the UPGMA tree made from all the sequences
- This process is repeated (default: 10x) with many random subsets of sequences; the tree nodes that appear more consistently across subsets have higher support
- Node colours: red for % support, yellow for 50-75%, green for 25-50%, and blue for < 25% support
29 The jackknifed replicate PCoA plots can be compared to assess the degree of variation from one replicate to the next. QIIME displays this variation by displaying confidence ellipsoids around the samples represented in a PCoA plot.
30 Communities clustered using PCoA of the unweighted UniFrac distance matrix Science 18 December 2009: Vol. 326 no pp DOI: /science
31 OTU heatmap [Figure: heatmap of samples against OTUs, with the classification of the OTUs]
32 Rows (OTUs): ordered by OTU phylogenetic tree. Columns (samples): ordered by UPGMA tree. Not currently implemented directly in QIIME; use other software such as R.
33 R package: phyloseq
- Phyloseq: McMurdie PJ, Holmes S (2013) phyloseq: An R Package for Reproducible Interactive Analysis and Graphics of Microbiome Census Data. PLoS ONE 8(4): e doi: /journal.pone
- Improved OTU heatmap visualizations can be generated using the plot_heatmap() command in the phyloseq package for R
- Could use ordination rather than hierarchical clustering to order samples
34 Examples: papers that use other hierarchical clustering to order samples in heatmap
37 Note on rarefying
beta_diversity_through_plots.py -i otu_table.biom -o bdiv_even100/ -t rep_set.tre -m Fasting_Map.txt -e 100
Note: don't confuse rarefying with rarefaction.
- Rarefaction: sample without replacement at many different sequencing depths; used for alpha diversity; statistically valid
- Rarefying: library size normalization by random subsampling without replacement; an attempt to normalize by selecting the same number of sequences from each sample; not statistically valid. E.g. if you have 100 reads for sample A and a larger number from sample B, then take only 100 reads from sample B to normalize.
- QIIME Describe method for taking different sequencing depths of samples into account without removing data (integrated into R package phyloseq)
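What rarefying to an even depth does to an OTU table can be sketched as below, using a depth of 100 to mirror the `-e 100` option in the command above. The `rarefy` helper and the read counts are hypothetical. Returning `None` for samples with fewer reads than the chosen depth is an assumption of this sketch.

```python
import random


def rarefy(counts, depth, seed=0):
    """Randomly keep exactly `depth` reads from one sample, without replacement.

    counts maps OTU id -> number of reads. Returns a new counts dict, or
    None if the sample has fewer than `depth` reads.
    """
    if sum(counts.values()) < depth:
        return None
    rng = random.Random(seed)
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    kept = {}
    for otu in rng.sample(reads, depth):
        kept[otu] = kept.get(otu, 0) + 1
    return kept


otu_table = {
    "A": {"otu1": 60, "otu2": 40},                  # exactly 100 reads
    "B": {"otu1": 900, "otu2": 50, "otu3": 50},     # 1000 reads
    "C": {"otu1": 30},                              # too shallow
}
even_table = {sample: rarefy(counts, 100) for sample, counts in otu_table.items()}
```

Sample B loses 900 of its 1000 reads in this step, which is the data removal the slide warns about.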
38 Other QIIME visuals: OTU network
- Visual representation of shared OTUs and unique OTUs
- Red circles = samples; white squares = OTUs; green = fasting; blue = control
- Core set of OTUs that differentiate fasting from control?
39 Other QIIME analyses: Procrustes analysis
Compare UniFrac PCoA plots generated by two different processing pipelines, different 16S variable regions, different sequencing technologies, or repeated samples.
40 Which 16S database?
Databases may differ in:
- Domains covered: Greengenes (archaeal, bacterial), Silva (archaeal, bacterial, eukaryotic), RDP (archaeal, bacterial)
- Coverage or number of sequences, quality
- Taxonomic classification of sequences
- Frequency of updating
- Compatibility of data with choice of analysis platform
Options?
1. Choose the same database throughout your study, one that is compatible with the tools you are using AND/OR the same database used in other studies if you want to compare
2. Compare analyses using different databases
3. Database of well curated/classified sequences specific to your environment(?)
More informationSupplementary Information
Supplementary Information Supplementary Figure 1. Schematic pipeline for single-cell genome assembly, cleaning and annotation. a. The assembly process was optimized to account for multiple cells putatively
More informationA (short) introduction to phylogenetics
A (short) introduction to phylogenetics Thibaut Jombart, Marie-Pauline Beugin MRC Centre for Outbreak Analysis and Modelling Imperial College London Genetic data analysis with PR Statistics, Millport Field
More informationSequence Analysis '17- lecture 8. Multiple sequence alignment
Sequence Analysis '17- lecture 8 Multiple sequence alignment Ex5 explanation How many random database search scores have e-values 10? (Answer: 10!) Why? e-value of x = m*p(s x), where m is the database
More informationMicrobial Taxonomy and the Evolution of Diversity
19 Microbial Taxonomy and the Evolution of Diversity Copyright McGraw-Hill Global Education Holdings, LLC. Permission required for reproduction or display. 1 Taxonomy Introduction to Microbial Taxonomy
More informationconcentration ( mol l -1 )
concentration ( mol l -1 ) 8 10 0 20 40 60 80 100 120 140 160 180 methane sulfide ammonium oxygen sulfate (/10) b depth (m) 12 14 Supplementary Figure 1. Water column parameters from August 2011. Chemical
More informationFocus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations.
Previously Focus was on solving matrix inversion problems Now we look at other properties of matrices Useful when A represents a transformations y = Ax Or A simply represents data Notion of eigenvectors,
More informationPhylogenetic Tree Reconstruction
I519 Introduction to Bioinformatics, 2011 Phylogenetic Tree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & Computing, IUB Evolution theory Speciation Evolution of new organisms is driven
More informationCS5238 Combinatorial methods in bioinformatics 2003/2004 Semester 1. Lecture 8: Phylogenetic Tree Reconstruction: Distance Based - October 10, 2003
CS5238 Combinatorial methods in bioinformatics 2003/2004 Semester 1 Lecture 8: Phylogenetic Tree Reconstruction: Distance Based - October 10, 2003 Lecturer: Wing-Kin Sung Scribe: Ning K., Shan T., Xiang
More informationIntroduction to microbiota data analysis
Introduction to microbiota data analysis Natalie Knox, PhD Head Bacterial Genomics, Bioinformatics Core National Microbiology Laboratory, Public Health Agency of Canada 2 National Microbiology Laboratory
More informationBAT Biodiversity Assessment Tools, an R package for the measurement and estimation of alpha and beta taxon, phylogenetic and functional diversity
Methods in Ecology and Evolution 2015, 6, 232 236 doi: 10.1111/2041-210X.12310 APPLICATION BAT Biodiversity Assessment Tools, an R package for the measurement and estimation of alpha and beta taxon, phylogenetic
More informationA Bayesian taxonomic classification method for 16S rrna gene sequences with improved species-level accuracy
Gao et al. BMC Bioinformatics (2017) 18:247 DOI 10.1186/s12859-017-1670-4 SOFTWARE Open Access A Bayesian taxonomic classification method for 16S rrna gene sequences with improved species-level accuracy
More informationHandling Fungal data in MoBeDAC
Handling Fungal data in MoBeDAC Jason Stajich UC Riverside Fungal Taxonomy and naming undergoing a revolution One fungus, one name http://www.biology.duke.edu/fungi/ mycolab/primers.htm http://www.biology.duke.edu/fungi/
More informationPhylogeny 9/8/2014. Evolutionary Relationships. Data Supporting Phylogeny. Chapter 26
Phylogeny Chapter 26 Taxonomy Taxonomy: ordered division of organisms into categories based on a set of characteristics used to assess similarities and differences Carolus Linnaeus developed binomial nomenclature,
More informationClusters. Unsupervised Learning. Luc Anselin. Copyright 2017 by Luc Anselin, All Rights Reserved
Clusters Unsupervised Learning Luc Anselin http://spatial.uchicago.edu 1 curse of dimensionality principal components multidimensional scaling classical clustering methods 2 Curse of Dimensionality 3 Curse
More informationPalaeontological community and diversity analysis brief notes. Oyvind Hammer Paläontologisches Institut und Museum, Zürich
Palaeontological community and diversity analysis brief notes Oyvind Hammer Paläontologisches Institut und Museum, Zürich ohammer@nhm.uio.no Zürich, June 3, 2002 Contents 1 Introduction 2 2 The basics
More informationInferring phylogeny. Constructing phylogenetic trees. Tõnu Margus. Bioinformatics MTAT
Inferring phylogeny Constructing phylogenetic trees Tõnu Margus Contents What is phylogeny? How/why it is possible to infer it? Representing evolutionary relationships on trees What type questions questions
More informationH. Pieter J. van Veelen *, Joana Falcao Salles and B. Irene Tieleman
van Veelen et al. Microbiome (2017) 5:156 DOI 10.1186/s40168-017-0371-6 RESEARCH Open Access Multi-level comparisons of cloacal, skin, feather and nest-associated microbiota suggest considerable influence
More informationOrganizing Diversity Taxonomy is the discipline of biology that identifies, names, and classifies organisms according to certain rules.
1 2 3 4 5 6 7 8 9 10 Outline 1.1 Introduction to AP Biology 1.2 Big Idea 1: Evolution 1.3 Big Idea 2: Energy and Molecular Building Blocks 1.4 Big Idea 3: Information Storage, Transmission, and Response
More informationMapping of Science. Bart Thijs ECOOM, K.U.Leuven, Belgium
Mapping of Science Bart Thijs ECOOM, K.U.Leuven, Belgium Introduction Definition: Mapping of Science is the application of powerful statistical tools and analytical techniques to uncover the structure
More informationPhylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz
Phylogenetic Trees What They Are Why We Do It & How To Do It Presented by Amy Harris Dr Brad Morantz Overview What is a phylogenetic tree Why do we do it How do we do it Methods and programs Parallels
More informationUnit 5: Taxonomy. KEY CONCEPT Organisms can be classified based on physical similarities.
KEY CONCEPT Organisms can be classified based on physical similarities. Linnaeus developed the scientific naming system still used today. Taxonomy is the science of naming and classifying organisms. White
More informationSUPPLEMENTARY INFORMATION
City of origin as a confounding variable. The original study was designed such that the city where sampling was performed was perfectly confounded with where the DNA extractions and sequencing was performed.
More informationThe implications of neutral evolution for neutral ecology. Daniel Lawson Bioinformatics and Statistics Scotland Macaulay Institute, Aberdeen
The implications of neutral evolution for neutral ecology Daniel Lawson Bioinformatics and Statistics Scotland Macaulay Institute, Aberdeen How is How is diversity Diversity maintained? maintained? Talk
More informationINTRODUCTION TO MULTIVARIATE ANALYSIS OF ECOLOGICAL DATA
INTRODUCTION TO MULTIVARIATE ANALYSIS OF ECOLOGICAL DATA David Zelený & Ching-Feng Li INTRODUCTION TO MULTIVARIATE ANALYSIS Ecologial similarity similarity and distance indices Gradient analysis regression,
More information2/19/2018. Dataset: 85,122 islands 19,392 > 1km 2 17,883 with data
The group numbers are arbitrary. Remember that you can rotate dendrograms around any node and not change the meaning. So, the order of the clusters is not meaningful. Taking a subset of the data changes
More informationPOPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics
POPULATION GENETICS Winter 2005 Lecture 17 Molecular phylogenetics - in deriving a phylogeny our goal is simply to reconstruct the historical relationships between a group of taxa. - before we review the
More information