An Adaptive Association Test for Microbiome Data


Chong Wu (1), Jun Chen (2), Junghi Kim (1), and Wei Pan (1)
(1) Division of Biostatistics, School of Public Health, University of Minnesota
(2) Division of Biomedical Statistics and Informatics, Mayo Clinic
Aug. 4th, 2016, JSM 2016

Table of Contents
1 Background
2 Microbiome Based Sum of Powered Score Tests
3 Numerical Examples
  - Simulation Results
  - Application to a Gut Microbiome Data

Human Microbiome Data
- Microbe: a tiny living organism, such as a bacterium, fungus, or virus
- Microbiome: the genomes of human microbes and the way they interact with the human host
Why is the human microbiome important?
- Microbes living in the human body outnumber human cells by more than 10 to 1
- They play an important part in our overall health

Human microbiome association studies
- The microbiome is seen as an extended human genome
- Detect an association of human microbiome diversity with a phenotype of interest
- Improve our understanding of the non-genetic component of complex traits and diseases
Goal
- Test the association between the whole microbiome composition and the outcome of interest

The Features of Human Microbiome Data
- Operational taxonomic units (OTUs): surrogates for biological taxa
- High dimensional: the number of OTUs is usually much larger than the sample size
- Overdispersion: most OTUs are rare
Phylogenetic Tree
- A branching diagram or tree showing the inferred evolutionary relationships among various biological species

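Outline
Null hypothesis: no association between any taxa and the outcome

Example OTU table (read counts):

Sample  OTU1  OTU2  OTU3  OTU4  OTU5
1       0     0     2     0     1
2       2     5     0     0     1
3       0     0     0     1     0
4       1     0     0     0     3

[Figure: rooted phylogenetic tree with leaves OTU1 to OTU5, internal nodes 6 and 7, the root (common ancestor), and branch lengths $b_1, \dots, b_7$.]

Taxon proportion: $p_{ik} = Z_{ik} / \sum_{j=1}^{q} Z_{ij}$, where $Z_{ik}$ is the count in sample $i$ under taxon $k$ (summed over its descendant OTUs when $k$ is an internal node) and $q$ is the number of OTUs
e.g. $p_{13} = 2/(1+2) \approx 0.7$; $p_{26} = (2+5)/(2+5+1) \approx 0.9$

Generalized taxon proportion: $Q^u_{ik} = b_k I(p_{ik} > 0)$; $Q^w_{ik} = b_k p_{ik}$
e.g. $Q^u_{13} = b_3$; $Q^w_{13} = 0.7 b_3$

Test statistics: $U = \sum_{i=1}^{n} (Y_i - \hat{\mu}_{i,0}) Q_{i\cdot}$; $T_{MiSPU(\gamma)} = \sum_{k=1}^{m} U_k^{\gamma}$; $T_{aMiSPU} = \min_{\gamma \in \Gamma} P_{MiSPU(\gamma)}$

P values are obtained by residual permutation.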
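The two worked proportions on this slide can be reproduced with a few lines of Python. This is only a sketch: the count table is the one shown on the slide, the descendant sets are inferred from the slide's arithmetic (taxon 6 is taken to be the parent of OTU1 and OTU2), and the branch length `b_3 = 0.5` is a made-up value for illustration.

```python
# OTU counts from the slide's example table (rows: samples 1-4, columns: OTU1-OTU5).
otu_counts = [
    [0, 0, 2, 0, 1],
    [2, 5, 0, 0, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 3],
]

# Descendant OTU columns (0-based) for the two taxa used in the slide's examples.
# Assumption: taxon 3 is the leaf OTU3 and taxon 6 is the parent of OTU1 and OTU2,
# which is what the arithmetic (2+5)/(2+5+1) implies.
descendants = {3: [2], 6: [0, 1]}


def taxon_proportion(sample_row, otu_columns):
    """p_ik: share of a sample's reads that fall under taxon k."""
    total = sum(sample_row)
    return sum(sample_row[j] for j in otu_columns) / total


def generalized_proportions(p_ik, branch_length):
    """Return (Q^u_ik, Q^w_ik) = (b_k * I(p_ik > 0), b_k * p_ik)."""
    q_unweighted = branch_length * (1.0 if p_ik > 0 else 0.0)
    q_weighted = branch_length * p_ik
    return q_unweighted, q_weighted


p_13 = taxon_proportion(otu_counts[0], descendants[3])  # sample 1, taxon 3
p_26 = taxon_proportion(otu_counts[1], descendants[6])  # sample 2, taxon 6

b_3 = 0.5  # hypothetical branch length, not from the slides
q_u_13, q_w_13 = generalized_proportions(p_13, b_3)

print(round(p_13, 3), round(p_26, 3), q_u_13, round(q_w_13, 3))
```

The unweighted version keeps only presence or absence of a taxon, while the weighted version keeps its relative abundance; both are scaled by the branch length.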

MiSPU
- Generalized taxon proportion $Q_{ik}$: $Q^w_{ik} = b_k p_{ik}$; $Q^u_{ik} = b_k I(p_{ik} > 0)$
- The raw weighted UniFrac distance is exactly the same as the $L_1$ distance of the $Q^w_{ik}$
- For a binary outcome, we use a logistic regression model:
  $\mathrm{Logit}[\Pr(Y_i = 1)] = \beta_0 + \beta' X_i + \sum_{k=1}^{m} Q_{ik} \varphi_k$
- $H_0$: $\varphi = (\varphi_1, \dots, \varphi_m)' = 0$; that is, there is no association between any taxa and the outcome of interest
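The UniFrac remark on this slide can be checked numerically. The sketch below assumes the usual raw (unnormalized) weighted UniFrac form, the sum over branches of $b_k |p_{ik} - p_{jk}|$, and uses made-up branch lengths and taxon proportions; the function names are illustrative, not from any package.

```python
def raw_weighted_unifrac(p_a, p_b, branch_lengths):
    """Sum over branches of b_k * |p_ak - p_bk|."""
    return sum(b * abs(x - y) for b, x, y in zip(branch_lengths, p_a, p_b))


def l1_distance_of_weighted_q(p_a, p_b, branch_lengths):
    """L1 distance between the weighted generalized proportions Q^w = b_k * p_k."""
    q_a = [b * x for b, x in zip(branch_lengths, p_a)]
    q_b = [b * y for b, y in zip(branch_lengths, p_b)]
    return sum(abs(x - y) for x, y in zip(q_a, q_b))


# Hypothetical values for two samples over four taxa (branches).
branch_lengths = [0.3, 0.1, 0.5, 0.2]
p_sample_a = [0.6, 0.0, 0.4, 1.0]
p_sample_b = [0.1, 0.5, 0.4, 0.6]

d_unifrac = raw_weighted_unifrac(p_sample_a, p_sample_b, branch_lengths)
d_l1 = l1_distance_of_weighted_q(p_sample_a, p_sample_b, branch_lengths)
print(d_unifrac, d_l1)
```

The two numbers agree up to floating-point rounding because branch lengths are non-negative, so $b_k |p_{ik} - p_{jk}| = |b_k p_{ik} - b_k p_{jk}|$.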

MiSPU
- Score: $U = \sum_{i=1}^{n} (Y_i - \hat{\mu}_{i,0}) Q_{i\cdot}$
- MiSPU test statistic: $T_{MiSPU(\gamma)} = w'U = \sum_{k=1}^{m} U_k^{\gamma}$, with weights $w_k = U_k^{\gamma - 1}$
- Use a permutation scheme (Pan et al., 2014) to calculate the p value
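A minimal sketch of the score vector, the MiSPU statistic, and a residual-permutation p value follows. It makes several simplifying assumptions that are not in the slides: there are no covariates, so the null fit is intercept-only and $\hat{\mu}_{i,0}$ is the sample mean of $Y$; the p value uses the common "(exceedances + 1) / (permutations + 1)" convention; and the data are invented. Function names are hypothetical.

```python
import random


def score_vector(residuals, Q):
    """U_k = sum_i residual_i * Q[i][k] for each taxon k."""
    m = len(Q[0])
    return [sum(r * row[k] for r, row in zip(residuals, Q)) for k in range(m)]


def mispu_statistic(U, gamma):
    """T_MiSPU(gamma) = sum_k U_k ** gamma for a positive integer gamma."""
    return sum(u ** gamma for u in U)


def mispu_pvalue(y, Q, gamma, n_perm=999, seed=0):
    """Residual-permutation p value (assumption: intercept-only null model)."""
    rng = random.Random(seed)
    mu0 = sum(y) / len(y)
    residuals = [yi - mu0 for yi in y]
    t_obs = mispu_statistic(score_vector(residuals, Q), gamma)
    exceed = 0
    for _ in range(n_perm):
        perm = residuals[:]
        rng.shuffle(perm)  # break the link between outcome and microbiome
        t_perm = mispu_statistic(score_vector(perm, Q), gamma)
        if abs(t_perm) >= abs(t_obs):
            exceed += 1
    return t_obs, (exceed + 1) / (n_perm + 1)


# Hypothetical data: 8 samples, 3 taxa; the first taxon tracks the outcome.
y = [1, 1, 1, 1, 0, 0, 0, 0]
Q = [
    [0.9, 0.2, 0.1], [0.8, 0.1, 0.3], [0.7, 0.3, 0.2], [0.9, 0.2, 0.2],
    [0.1, 0.2, 0.3], [0.2, 0.3, 0.1], [0.0, 0.1, 0.2], [0.1, 0.2, 0.2],
]
t_obs, p_value = mispu_pvalue(y, Q, gamma=2)
print(t_obs, p_value)
```

With covariates, the residuals would instead come from the fitted null logistic model; the rest of the procedure is unchanged.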

The choice of γ
- As γ goes to infinity, we have $T_{MiSPU(\infty)} \propto \|U\|_{\infty} = \max_{k=1,\dots,m} |U_k|$
- Intuition in the choice of γ:
  - the more sparse the signals, the larger γ
  - if (most) associations are in one direction, then use an odd γ
- In practice, how to choose γ?
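Both pieces of intuition can be seen on toy score vectors. In this sketch the vectors are invented, and `gamma_root_norm` takes the γ-th root of the sum of powered absolute scores only so that values are comparable across γ; that rescaling is an illustration device, not part of the test.

```python
def spu(U, gamma):
    """Sum of powered scores: sum_k U_k ** gamma."""
    return sum(u ** gamma for u in U)


def gamma_root_norm(U, gamma):
    """(sum_k |U_k| ** gamma) ** (1 / gamma), which tends to max |U_k| as gamma grows."""
    return sum(abs(u) ** gamma for u in U) ** (1.0 / gamma)


# Sparse signal: one large score among small ones (hypothetical values).
sparse_U = [0.2, -0.1, 3.0, 0.3]
max_abs = max(abs(u) for u in sparse_U)
norms = {g: gamma_root_norm(sparse_U, g) for g in (2, 4, 8, 16, 64)}
for g, value in norms.items():
    print(g, value)
print("max |U_k| =", max_abs)

# Mixed directions: an odd gamma lets opposite signs cancel, an even gamma does not.
mixed_sign_U = [1.0, -1.0, 1.0, -1.0]
odd_stat = spu(mixed_sign_U, 1)
even_stat = spu(mixed_sign_U, 2)
print(odd_stat, even_stat)
```

A larger γ puts nearly all weight on the largest score, which suits sparse signals; an odd γ keeps the sign of each score, which helps only when most associations point the same way.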

aMiSPU
- Choose the γ giving the most significant p value
- Use an adaptive test idea (Pan et al., 2014)
- aMiSPU test statistic: $T_{aMiSPU} = \min_{\gamma \in \Gamma} P_{MiSPU(\gamma)}$
- Use permutation or parametric bootstrap to estimate its p value
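One way to sketch the adaptive step is to reuse a single set of permutations twice: first for the p value of each MiSPU(γ), then for the null distribution of the minimum p value. The code below is an illustration under assumptions not stated in the slides: an intercept-only null model, a candidate set Γ = {2, 3, 4, ∞}, invented data, and hypothetical function names.

```python
import math
import random


def scores(residuals, Q):
    """U_k = sum_i residual_i * Q[i][k]."""
    return [sum(r * row[k] for r, row in zip(residuals, Q)) for k in range(len(Q[0]))]


def spu(U, gamma):
    """Sum of powered scores; gamma = inf gives max |U_k|."""
    if math.isinf(gamma):
        return max(abs(u) for u in U)
    return sum(u ** gamma for u in U)


def amispu(y, Q, gammas=(2, 3, 4, math.inf), n_perm=199, seed=0):
    rng = random.Random(seed)
    mu0 = sum(y) / len(y)  # assumption: intercept-only null model
    residuals = [yi - mu0 for yi in y]
    obs = [abs(spu(scores(residuals, Q), g)) for g in gammas]

    # One shared set of permuted statistics, one row per permutation.
    null = []
    for _ in range(n_perm):
        perm = residuals[:]
        rng.shuffle(perm)
        U = scores(perm, Q)
        null.append([abs(spu(U, g)) for g in gammas])

    n_gamma = len(gammas)
    # P value of each MiSPU(gamma) on the observed data.
    p_obs = [(sum(row[j] >= obs[j] for row in null) + 1) / (n_perm + 1)
             for j in range(n_gamma)]
    min_p_obs = min(p_obs)

    # Minimum p value of each permuted data set, ranked against the other permutations.
    exceed = 0
    for b, row in enumerate(null):
        p_b = [(sum(other[j] >= row[j] for c, other in enumerate(null) if c != b) + 1)
               / n_perm for j in range(n_gamma)]
        if min(p_b) <= min_p_obs:
            exceed += 1
    p_adaptive = (exceed + 1) / (n_perm + 1)
    return dict(zip(gammas, p_obs)), p_adaptive


# Hypothetical data: 10 samples, 3 taxa; the first taxon tracks the outcome.
y = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
Q = [
    [0.9, 0.2, 0.1], [0.8, 0.1, 0.3], [0.7, 0.3, 0.2], [0.9, 0.2, 0.2], [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.3], [0.2, 0.3, 0.1], [0.0, 0.1, 0.2], [0.1, 0.2, 0.2], [0.2, 0.2, 0.3],
]
p_by_gamma, p_adaptive = amispu(y, Q)
print(p_by_gamma, p_adaptive)
```

The adaptive p value is never smaller than the smallest single-γ p value, which is the price paid for not having to pick γ in advance.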

Simulation Results
[Figure: power versus effect β (0.0 to 2.0) for K(0), K(0.5), Kopt, MiSPU(2), MiSPU(3), MiSPU(∞), and aMiSPU. Panel labels: "X, Z Independent" and "Adjustment for X".]

Real Data: Gut Microbiome
- Diet strongly affects human health, partly by modulating gut microbiome composition
- In one cross-sectional study, 98 healthy volunteers were enrolled and habitual long-term diet information was collected using a food frequency questionnaire
- The original study failed to detect the gender effect (p value 0.080)
- Increasing evidence suggests that there is a sex difference in the human gut microbiome (Bolnick et al., 2014)
- Our new method can detect it (p value 0.0058)

Real Data: Gut Microbiome
- A taxon in Bacteroides accounts for more than 90% of the relative contributions
- The top 4 taxa all come from Bacteroides
- Gender status is likely associated with Bacteroides, but independent of the other enterotypes