Explore SNP polymorphism data. A. Dereeper, Y. Hueber
1 Explore SNP polymorphism data A. Dereeper, Y. Hueber Bioinformatics trainings, Supagro, February, 2016
2 Tablet Graphical tool to visualize assemblies. Accepts many formats: ACE, SAM, BAM.
3 GATK (Genome Analysis ToolKit) Software package to analyse NGS data. Implemented to analyse human resequencing data for medical purposes (1000 Genomes, The Cancer Genome Atlas). Includes depth analyses, quality score recalibration, SNP/InDel detection. Complementary with other packages: SAMtools, Picard Tools, VCFtools, BEDtools.
PREPROCESS:
* Index human genome (Picard); we used HG18 from UCSC.
* Convert Illumina reads to Fastq format.
* Convert Illumina 1.6 read quality scores to standard Sanger scores.
FOR EACH SAMPLE:
1. Align samples to genome (BWA), generates SAI files.
2. Convert SAI to SAM (BWA)
3. Convert SAM to BAM binary format (SAMtools)
4. Sort BAM (SAMtools)
5. Index BAM (SAMtools)
6. Identify target regions for realignment (Genome Analysis Toolkit)
7. Realign BAM to get better Indel calling (Genome Analysis Toolkit)
8. Reindex the realigned BAM (SAMtools)
9. Call Indels (Genome Analysis Toolkit)
10. Call SNPs (Genome Analysis Toolkit)
11. View aligned reads in BAM/BAI (Integrative Genomics Viewer)
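As a rough illustration of steps 1 to 10 of the per-sample list, the sketch below only builds and prints command lines; it runs nothing. The file names, the `reference.fa` and `GenomeAnalysisTK.jar` paths and the helper name are hypothetical. The options follow legacy `bwa aln`/`bwa samse`, samtools 1.x and GATK 3.x usage, which is an assumption about the versions in use: newer GATK releases dropped the realignment tools, so check everything against the installed software.

```python
def per_sample_commands(sample, ref="reference.fa", gatk_jar="GenomeAnalysisTK.jar"):
    """Return (step, argv, stdout_file) tuples for steps 1-10; nothing is executed.

    All paths are hypothetical placeholders derived from the sample name.
    """
    fq, sai, sam = f"{sample}.fastq", f"{sample}.sai", f"{sample}.sam"
    bam, sorted_bam = f"{sample}.bam", f"{sample}.sorted.bam"
    intervals, realigned = f"{sample}.intervals", f"{sample}.realigned.bam"
    gatk = ["java", "-jar", gatk_jar]
    return [
        (1, ["bwa", "aln", ref, fq], sai),             # align, SAI on stdout
        (2, ["bwa", "samse", ref, sai, fq], sam),      # SAI -> SAM on stdout
        (3, ["samtools", "view", "-b", "-o", bam, sam], None),
        (4, ["samtools", "sort", "-o", sorted_bam, bam], None),
        (5, ["samtools", "index", sorted_bam], None),
        (6, gatk + ["-T", "RealignerTargetCreator", "-R", ref,
                    "-I", sorted_bam, "-o", intervals], None),
        (7, gatk + ["-T", "IndelRealigner", "-R", ref, "-I", sorted_bam,
                    "-targetIntervals", intervals, "-o", realigned], None),
        (8, ["samtools", "index", realigned], None),
        (9, gatk + ["-T", "UnifiedGenotyper", "-R", ref, "-I", realigned,
                    "-glm", "INDEL", "-o", f"{sample}.indels.vcf"], None),
        (10, gatk + ["-T", "UnifiedGenotyper", "-R", ref, "-I", realigned,
                     "-glm", "SNP", "-o", f"{sample}.snps.vcf"], None),
    ]

# Dry run: print each command, with shell-style redirection where stdout is captured.
for step, argv, out in per_sample_commands("RC1"):
    print(step, " ".join(argv) + (f" > {out}" if out else ""))
```

Keeping each command as a list of arguments would let the same tuples be passed to `subprocess.run` later, once the options have been verified.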
4 Workflow, run for each sample (RC1, RC2, RC3, RC4): Fastq -> Cutadapt -> Mapping BWA -> Add or Replace Groups -> BAM with read group. The four BAM files with read groups are then merged (mergesam) into a global BAM with read groups, which leads to the VCF file.
5 For GBS data: TASSEL pipeline, version 5 (TASSEL-GBS, PLoS ONE, 2014)
6 GBS, RAD-Seq, RNA-Seq, WGRS. Reads pre-processing and mapping + SNP calling and genotype assignment (Tassel, Galaxy workflow). Genotyping data storage and mining. Genotyping data analyses and visualization (GWAS, diversity...)
7 Format VCF (Variant Call Format) Advantages: variation description for each position + genotype assignments. Indexed flat files. Binary files also exist: BCF format.
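To make the "variation description for each position + genotype assignments" concrete, here is a minimal sketch that splits one VCF data line into its fixed fields and per-sample genotypes. The header, sample names and record are made-up illustration data, and the parser ignores most of the VCF specification (meta lines, INFO parsing, symbolic alleles).

```python
# Made-up header and record: three samples, one biallelic SNP.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tRC1\tRC2\tRC3"
record = "Chr1\t1234\t.\tA\tG\t50\tPASS\tDP=30\tGT:DP\t0/0:10\t0/1:12\t./.:0"

def parse_vcf_record(header_line, record_line):
    """Return (fixed fields, {sample: list of allele bases, or None if missing})."""
    columns = header_line.lstrip("#").split("\t")
    values = record_line.split("\t")
    fixed = dict(zip(columns[:9], values[:9]))
    gt_index = fixed["FORMAT"].split(":").index("GT")
    alleles = [fixed["REF"]] + fixed["ALT"].split(",")  # index 0 = REF, 1.. = ALT
    genotypes = {}
    for sample, field in zip(columns[9:], values[9:]):
        gt = field.split(":")[gt_index]
        calls = gt.split("|" if "|" in gt else "/")     # "|" phased, "/" unphased
        genotypes[sample] = None if "." in calls else [alleles[int(c)] for c in calls]
    return fixed, genotypes

fixed, genotypes = parse_vcf_record(header, record)
print(fixed["CHROM"], fixed["POS"], genotypes)
```

For real files, an existing VCF library or VCFtools is a safer choice than hand-written parsing.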
8 Format Pileup
- Another format for variant calling (generated by samtools)
- Describes the alignment position by position (one line per reference position), not read by read as in SAM format
- Used by software such as VarScan (varscan pileup2snp)
- Frequently used for rare variants, with a low frequency (e.g. viral populations)
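Because each pileup line summarises all reads covering one position, allele frequencies (including low ones) can be read directly from the base column. The sketch below counts bases in that column under the usual samtools conventions: `.` and `,` match the reference, `^` is followed by a mapping-quality character, `$` ends a read, and `+n`/`-n` announce an indel of n bases. The example string is invented and the function name is hypothetical.

```python
import re

def count_bases(ref, bases):
    """Count observed bases at one position from a pileup base string."""
    counts = {}
    i = 0
    while i < len(bases):
        c = bases[i]
        if c == "^":                      # read start: skip '^' and its quality char
            i += 2
            continue
        if c in "+-":                     # indel: skip sign, length digits and bases
            digits = re.match(r"\d+", bases[i + 1:]).group()
            i += 1 + len(digits) + int(digits)
            continue
        if c in ".,":                     # match to reference (either strand)
            base = ref.upper()
        elif c.upper() in "ACGTN":        # mismatch (either strand)
            base = c.upper()
        else:                             # '$', '*' and other markers are not bases
            i += 1
            continue
        counts[base] = counts.get(base, 0) + 1
        i += 1
    return counts

counts = count_bases("A", "..,G^F.g$+2AT,")
total = sum(counts.values())
print(counts, {b: n / total for b, n in counts.items()})
```

A caller such as VarScan additionally applies thresholds on base quality, depth and strand, which this sketch leaves out.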
9 Other GATK functionalities. Module DepthOfCoverage: reports the sequencing depth for each gene, each position and each individual. Module ReadBackedPhasing: establishes, where possible, associations between alleles (phase and haplotypes) at heterozygous sites. [Figure label: "and not AGG GGA"]
10 Haplotypes and phasing. Haplotype: specific group of genes or alleles that progeny inherit from one parent. Phasing: determination of haplotype phase; process of statistical estimation of haplotypes from genotype data. Can be inferred by statistical methods using non-ambiguous haplotypes present in the dataset (Gevalt, ShapeIT, Phase). Can be resolved using physical association of alleles within the reads (GATK ReadBackedPhasing, GATK HaplotypeCaller).
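Once phase is known, VCF writes genotypes with `|` instead of `/`, and the two haplotypes of an individual can be read off column-wise. A minimal sketch, assuming biallelic sites and invented genotypes (the function name is hypothetical):

```python
def haplotypes_from_phased(site_alleles, phased_gts):
    """Build the two haplotypes of one diploid individual.

    site_alleles: list of (REF, ALT) tuples, one per site.
    phased_gts:   list of phased genotype strings such as "0|1", same order.
    """
    hap1, hap2 = [], []
    for alleles, gt in zip(site_alleles, phased_gts):
        if "|" not in gt:
            raise ValueError(f"genotype {gt!r} is not phased")
        left, right = gt.split("|")
        hap1.append(alleles[int(left)])   # allele on the first haplotype
        hap2.append(alleles[int(right)])  # allele on the second haplotype
    return "".join(hap1), "".join(hap2)

sites = [("A", "G"), ("C", "T"), ("G", "A")]
print(haplotypes_from_phased(sites, ["0|1", "1|0", "0|1"]))
```

With unphased genotypes (`0/1`, `1/0`, `0/1`) the same individual would be compatible with several haplotype pairs, which is the ambiguity that phasing removes.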
11 Gigwa project, for managing massive variant data (GBS, RADSeq, WGRS). "With NGS arise serious computational challenges in terms of storage, search, sharing, analysis, and data visualization, that redefine some practices in data management."
- Based on NoSQL technology
- Handles VCF files (Variant Call Format) and annotations
- Supports multiple variant types: SNPs, InDels, SSRs, SV
- Powerful genotyping queries
- Easily scalable with MongoDB sharding
- Transparent access
- Takes phasing information into account when importing/exporting in VCF format
12 http://gigwa.southgreen.fr/gigwa/
13 SNiPlay: Web application for polymorphism analyses http://sniplay.southgreen.fr
14 IFB project Galaxy4Sniplay (WP4 IFB, Plant node)
15 Upload a VCF file in SNiPlay http://sniplay.southgreen.fr Upload a VCF file (+ reference if not available in the genome collection). Select the rice genome. The reference corresponds to mRNA.
16 Filters using VCFtools or Gigwa: MAF, missing data, annotation, position
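The MAF and missing-data filters can be sketched for a single biallelic site as follows. The thresholds (`min_maf=0.05`, `max_missing=0.2`) and the function names are illustrative assumptions, not defaults of VCFtools or Gigwa.

```python
def site_stats(genotypes):
    """Return (minor allele frequency, missing rate) for one biallelic site.

    genotypes: list of GT strings such as "0/0", "0/1", "1|1" or "./.".
    """
    alt = called = missing = 0
    for gt in genotypes:
        calls = gt.replace("|", "/").split("/")
        if "." in calls:                  # sample not genotyped at this site
            missing += 1
            continue
        called += len(calls)
        alt += sum(1 for c in calls if c != "0")
    missing_rate = missing / len(genotypes)
    if called == 0:
        return None, missing_rate
    alt_freq = alt / called
    return min(alt_freq, 1 - alt_freq), missing_rate

def keep_site(genotypes, min_maf=0.05, max_missing=0.2):
    """True if the site passes both filters (thresholds are example values)."""
    maf, missing_rate = site_stats(genotypes)
    return maf is not None and maf >= min_maf and missing_rate <= max_missing

print(site_stats(["0/0", "0/1", "1/1", "./.", "0/0"]))
print(keep_site(["0/0", "0/0", "0/0", "0/0"]))  # monomorphic site is dropped
```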
17 SNP annotation using SnpEff. It annotates and predicts the effects of variants on genes (amino acid changes...). Uses as input a GFF annotation file and a VCF.
18
19 Structure (sNMF, Admixture). Test different values of K (estimates the probability (likelihood tests) that samples are structured in K populations). For the best value of K, the application shows Q estimates for each individual (admixture percent) (probability that the individual belongs to each population).
20 MDS (Multi-Dimensional Scaling) plot. SNP-based distance tree with FastME
21 Diversity analysis. Comparison between individuals. Pi (nucleotide diversity): average number of nucleotide differences per site between any two DNA sequences chosen randomly from the sample population. Used to measure the degree of polymorphism within a population. Fst (fixation index): measure of population differentiation due to genetic structure.
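The definition of Pi translates directly into code: average the per-site differences over all pairs of sequences. A minimal sketch on invented, already aligned sequences of equal length (gaps and missing data are not handled):

```python
from itertools import combinations

def nucleotide_diversity(sequences):
    """Average number of differences per site over all pairs of sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    total_diff = sum(
        sum(a != b for a, b in zip(s1, s2))   # differences within one pair
        for s1, s2 in pairs
    )
    return total_diff / (len(pairs) * length)

# Three sequences of 8 sites: pairwise differences are 1, 1 and 2.
print(nucleotide_diversity(["ACGTACGT", "ACGTACGA", "ACGAACGT"]))
```

Here the three pairs differ at 1, 1 and 2 sites, so Pi is 4 differences over 3 pairs and 8 sites.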
22 SNP density by individual can allow the detection of introgression events. Introgression = movement of an exogenous region (gene flow) from one species into the gene pool of another by the repeated backcrossing of an interspecific hybrid with one of its parent species. Widely used in agronomy, but can also occur naturally.
23 Haplotypes. Haplotype reconstruction using Gevalt. Network with Haplophyle. Available only for regions presenting few variants (short regions, genes). Exploit phased VCF (in progress...). Figure legend: high-frequency haplotypes; group distribution within this haplotype; low-frequency haplotype; distance between 2 haplotypes (number of mutations).
24 GWAS (Genome-Wide Association Studies). Estimate association between a marker and a phenotypic character. Manhattan plots: display GWAS statistical tests (-log10 p-value) along chromosomes. TASSEL, MLMM software. False positives because of the structure of the studied panel => correction using population structure and kinship.
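The y-axis of a Manhattan plot is simply the -log10 transform of each marker's p-value. The sketch below computes those values and flags markers above a Bonferroni threshold; the Bonferroni correction is an assumption added for illustration (the slides do not name a threshold), and the p-values are invented.

```python
import math

def manhattan_points(results, alpha=0.05):
    """results: list of (chromosome, position, p_value) tuples.

    Returns the plotted points, the Bonferroni threshold on the -log10 scale,
    and the points at or above that threshold.
    """
    threshold = -math.log10(alpha / len(results))   # alpha divided by number of tests
    points = [(chrom, pos, -math.log10(p)) for chrom, pos, p in results]
    significant = [pt for pt in points if pt[2] >= threshold]
    return points, threshold, significant

results = [("Chr1", 100, 0.5), ("Chr1", 200, 1e-6),
           ("Chr2", 50, 0.02), ("Chr2", 300, 0.2)]
points, threshold, significant = manhattan_points(results)
print(round(threshold, 3), significant)
```

The points can then be plotted with position on the x-axis, grouped by chromosome, and the threshold drawn as a horizontal line.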
25 GWAS issues
- Choice of genotypic panel: phenotypic diversity for target traits must be sufficient (core collection, MAGIC lines, NAM...).
- Population structure induces high rates of false associations (false positives). Correction using population structure and kinship. Mixed models: Q; K (widely used); Q+K (widely used).
- Density of markers must be sufficient to provide good genome coverage. Density can also be highly variable.
- Linkage disequilibrium (LD) landscape: level of intra- and inter-chromosomal LD (number of loci in LD with loci from other chromosomes). Ideally, the LD profile should be flat to avoid distortion in association patterns.
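The Q, K and Q+K corrections are usually expressed as terms of a linear mixed model. As a sketch in standard notation (the symbols are not taken from the slides): y is the vector of phenotypes, X\beta holds the fixed effects including the tested marker, Q contains the population-structure covariates, and K is the kinship matrix that shapes the covariance of the random genetic effects u.

```latex
y = X\beta + Qv + Zu + e, \qquad
u \sim \mathcal{N}\!\left(0,\ \sigma_g^{2} K\right), \qquad
e \sim \mathcal{N}\!\left(0,\ \sigma_e^{2} I\right)
```

Dropping the Zu term gives the Q model, dropping Qv gives the K model, and keeping both gives Q+K.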
26 Practical session: Study of root characters using GWAS in Oryza sativa japonica. Influence of a correction using structure and kinship
27 Relatedness between individuals (kinship matrix). TASSEL and PLINK software. Estimation of relatedness between individuals using a distance matrix.
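One simple way to turn genotypes into a relatedness or distance matrix is identity by state (IBS): the proportion of alleles two individuals share across sites. This is only an illustrative choice; TASSEL and PLINK offer several estimators and the sketch does not claim to reproduce any of them. Genotypes are coded as alternate-allele counts (0, 1, 2), with `None` for missing data, and the values are invented.

```python
def ibs_similarity(g1, g2):
    """Proportion of alleles shared identical by state between two individuals."""
    shared = compared = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:        # skip sites missing in either individual
            continue
        shared += 2 - abs(a - b)          # 2, 1 or 0 alleles shared at this site
        compared += 2
    return shared / compared if compared else float("nan")

def similarity_matrix(genotypes):
    """genotypes: {individual: list of allele counts}. Returns (names, matrix)."""
    names = sorted(genotypes)
    matrix = [[ibs_similarity(genotypes[a], genotypes[b]) for b in names]
              for a in names]
    return names, matrix

genotypes = {
    "ind1": [0, 1, 2, 0],
    "ind2": [0, 1, 2, 0],
    "ind3": [2, 1, 0, None],
}
names, matrix = similarity_matrix(genotypes)
for name, row in zip(names, matrix):
    print(name, [round(x, 3) for x in row])
```

A distance matrix follows as 1 minus each similarity, which is the form expected by tree or MDS methods.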
Robust demographic inference from genomic and SNP data Laurent Excoffier Isabelle Duperret, Emilia Huerta-Sanchez, Matthieu Foll, Vitor Sousa, Isabel Alves Computational and Molecular Population Genetics
More informationTest for interactions between a genetic marker set and environment in generalized linear models Supplementary Materials
Biostatistics (2013), pp. 1 31 doi:10.1093/biostatistics/kxt006 Test for interactions between a genetic marker set and environment in generalized linear models Supplementary Materials XINYI LIN, SEUNGGUEN
More informationBinomial Mixture Model-based Association Tests under Genetic Heterogeneity
Binomial Mixture Model-based Association Tests under Genetic Heterogeneity Hui Zhou, Wei Pan Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN 55455 April 30,
More informationMap of AP-Aligned Bio-Rad Kits with Learning Objectives
Map of AP-Aligned Bio-Rad Kits with Learning Objectives Cover more than one AP Biology Big Idea with these AP-aligned Bio-Rad kits. Big Idea 1 Big Idea 2 Big Idea 3 Big Idea 4 ThINQ! pglo Transformation
More informationGLIDE: GPU-based LInear Detection of Epistasis
GLIDE: GPU-based LInear Detection of Epistasis Chloé-Agathe Azencott with Tony Kam-Thong, Lawrence Cayton, and Karsten Borgwardt Machine Learning and Computational Biology Research Group Max Planck Institute
More informationQuantitative Genomics and Genetics BTRY 4830/6830; PBSB
Quantitative Genomics and Genetics BTRY 4830/6830; PBSB.5201.01 Lecture 20: Epistasis and Alternative Tests in GWAS Jason Mezey jgm45@cornell.edu April 16, 2016 (Th) 8:40-9:55 None Announcements Summary
More informationGenotyping By Sequencing (GBS) Method Overview
enotyping By Sequencing (BS) Method Overview Sharon E Mitchell Institute for enomic Diversity Cornell University http://wwwmaizegeneticsnet/ Topics Presented Background/oals BS lab protocol Illumina sequencing
More information