Discovery of Genomic Structural Variations with Next-Generation Sequencing Data
- Kathlyn McGee
- 5 years ago
Transcription
1 Discovery of Genomic Structural Variations with Next-Generation Sequencing Data Advanced Topics in Computational Genomics Slides from Marcel H. Schulz, Tobias Rausch (EMBL), and Kai Ye (Leiden University)
2 Computational Methods
3 Detecting Genomic Rearrangements Reference Mate-pair or paired-end mapping abnormalities Split-Read alignments Read depth signals courtesy of Tobias Rausch (EMBL)
4 Detecting Genomic Rearrangements Reference Unmapped or single-anchored reads Mate-pair or paired-end mapping abnormalities Split-Read alignments Read depth signals Local assembly courtesy of Tobias Rausch (EMBL)
7 Insertions Deletions courtesy of Tobias Rausch (EMBL)
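A minimal sketch of how paired-end mapping abnormalities can hint at insertions and deletions: a pair that spans a deletion in the donor maps farther apart on the reference than the library insert size predicts, while a pair that spans an insertion maps closer together. The function name, the three-standard-deviation cutoff, and the labels are illustrative assumptions, not taken from the slides.

```python
def classify_pair(mapped_span, mean_insert, sd_insert, n_sd=3.0):
    """Label one read pair by comparing its mapped span to the library insert size.

    mapped_span: distance between the outer ends of the two mates on the reference.
    mean_insert, sd_insert: library insert-size mean and standard deviation.
    n_sd: hypothetical cutoff, in standard deviations, for calling a pair discordant.
    """
    if mapped_span > mean_insert + n_sd * sd_insert:
        # Mates map too far apart: sequence is missing in the donor.
        return "deletion"
    if mapped_span < mean_insert - n_sd * sd_insert:
        # Mates map too close together: the donor carries extra sequence.
        return "insertion"
    return "concordant"


if __name__ == "__main__":
    for span in (900, 150, 310):
        print(span, classify_pair(span, mean_insert=300, sd_insert=30))
```

A single discordant pair is weak evidence; in practice, callers cluster many pairs that support the same event before reporting it.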
8 Lee et al. (2009) courtesy of Tobias Rausch (EMBL) Korbel et al. (2007)
13 [Figure labels: 1 copy, 1 copy, 0 copies, 2 copies, 2 copies] courtesy of Tobias Rausch (EMBL) Chiang et al. (2009)
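Assuming the copy labels on this slide refer to copy number inferred from read depth, the idea can be sketched as a simple ratio: depth in a window scales roughly linearly with the number of copies. The function name, the window depths, and the diploid baseline below are hypothetical; this is not the method of Chiang et al.

```python
def estimate_copy_number(window_depths, baseline_depth, baseline_copies=2):
    """Estimate an integer copy number per window from read depth.

    window_depths: mean read depth observed in each genomic window.
    baseline_depth: depth expected for a region with baseline_copies copies.
    """
    return [round(baseline_copies * depth / baseline_depth) for depth in window_depths]


if __name__ == "__main__":
    depths = [30.0, 14.0, 0.5, 31.0, 16.0]  # made-up example values
    print(estimate_copy_number(depths, baseline_depth=30.0))
```

Real data would also need corrections for GC content and mappability before such a ratio is meaningful.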
14 Down syndrome: partial trisomy 21 courtesy of Tobias Rausch (EMBL) Xie et al. (2009)
15 Human cancer cell lines compared to normal cell lines (SegSeq algorithm: no fixed window size, multiple change-points method) Chiang et al. (2009)
16 With reads of length bps are we able to find the exact breakpoint of a structural variation?
17 With reads of length bps are we able to find the exact breakpoint of a structural variation? Yes using split-read mapping Donor Reference Example for read of length 40: Expected random matches for a 12bp read-prefix in the human genome?
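The question about random matches for a 12 bp prefix can be answered with back-of-the-envelope arithmetic. The sketch below assumes a genome of about 3.1 billion bases, uniformly random sequence with equal base frequencies, and matches counted on both strands; all three are simplifying assumptions rather than figures from the slide.

```python
def expected_random_matches(k, genome_length=3_100_000_000, both_strands=True):
    """Expected number of chance exact matches of a k-mer in a random genome.

    Each of the (genome_length - k + 1) start positions matches a fixed
    k-mer with probability (1/4)**k under a uniform base model.
    """
    positions = genome_length - k + 1
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** k


if __name__ == "__main__":
    for k in (12, 20, 40):
        print(k, expected_random_matches(k))
```

Under these assumptions a 12 bp prefix is expected to match by chance a few hundred times, which is why an unanchored split-read search over the whole genome is ambiguous, while a full 40 bp read is expected to match by chance essentially never.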
19 With reads of length bps are we able to find the exact breakpoint of a structural variation? Yes using anchored split-read mapping Donor Reference mappable read mate provides anchor to narrow down search space Medvedev et al. (2009)
20 The Pindel algorithm (Deletions) How to do that? Ye et al. (2009)
21 The Pindel algorithm (Deletions) (1) Use the 3' end of the left read as anchor point (2) Use pattern growth to search for minimum and maximum unique substrings from the 3' end of the unmapped read (<=2x insert size) Ye et al. (2009)
22 String matching by pattern growth ATGCA ATCAAGTATGCTTAGC courtesy of Kai Ye (Leiden U.)
26 String matching by pattern growth ATGCA ATCAAGTATGCTTAGC Minimum unique substring: ATG Maximum unique substring: ATGC courtesy of Kai Ye (Leiden U.)
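The pattern-growth example on this slide (pattern ATGCA searched in ATCAAGTATGCTTAGC) can be traced with a short sketch. The pattern is extended one character at a time while tracking which start positions in the text still match. The shortest prefix with exactly one match is the minimum unique substring, and the longest such prefix is the maximum unique substring. The function name and return shape are illustrative assumptions, and the pattern is grown from its left end for simplicity.

```python
def pattern_growth(pattern, text):
    """Grow prefixes of pattern and track their exact occurrences in text.

    Returns (min_unique, max_unique, position): the shortest and longest
    prefixes of pattern that occur exactly once in text, and the start
    position of that occurrence. Returns (None, None, None) if no prefix
    is unique.
    """
    candidates = list(range(len(text)))  # every position matches the empty prefix
    min_unique = max_unique = position = None
    for length in range(1, len(pattern) + 1):
        next_char = pattern[length - 1]
        # Keep only the candidates that can be extended by the next character.
        candidates = [
            start for start in candidates
            if start + length <= len(text) and text[start + length - 1] == next_char
        ]
        if not candidates:
            break  # the prefix no longer occurs anywhere
        if len(candidates) == 1:
            prefix = pattern[:length]
            if min_unique is None:
                min_unique = prefix
            max_unique = prefix
            position = candidates[0]
    return min_unique, max_unique, position


if __name__ == "__main__":
    print(pattern_growth("ATGCA", "ATCAAGTATGCTTAGC"))
```

For the slide's example, "A" matches five positions, "AT" two, "ATG" and "ATGC" one each, and "ATGCA" none.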
27 The Pindel algorithm (Deletions) (1) Use the 3' end of the left read as anchor point (2) Use pattern growth to search for minimum and maximum unique substrings from the 3' end of the unmapped read (<=2x insert size) (3) Use pattern growth to search for minimum and maximum unique substrings from the 5' end of the unmapped read (read length + Max_D), starting from the mapped end in step 2 Ye et al. (2009)
28 The Pindel algorithm (Deletions) (1) Use the 3' end of the left read as anchor point (2) Use pattern growth to search for minimum and maximum unique substrings from the 3' end of the unmapped read (<=2x insert size) (3) Use pattern growth to search for minimum and maximum unique substrings from the 5' end of the unmapped read (read length + Max_D), starting from the mapped end in step 2 (4) Check whether the complete unmapped read can be combined from the 3' and 5' end substring matches Ye et al. (2009)
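The deletion steps above can be illustrated with a toy split-read search. This is a simplified sketch, not Pindel itself: it ignores strands and the mate anchor, uses brute-force exact matching instead of pattern growth, and simply tries every split point of the read. The names `find_all`, `detect_deletion`, and `max_deletion` (standing in for Max_D) and the example sequences are assumptions made for illustration.

```python
def find_all(text, sub):
    """Return the start positions of all (possibly overlapping) exact matches."""
    hits, start = [], text.find(sub)
    while start != -1:
        hits.append(start)
        start = text.find(sub, start + 1)
    return hits


def detect_deletion(reference, read, max_deletion):
    """Find a split of read whose two parts flank a deletion in reference.

    Returns (deletion_start, deletion_length) or None. The left part must
    match the reference uniquely; the right part must match uniquely within
    max_deletion bases downstream of where the left part ends.
    """
    for split in range(1, len(read)):
        left, right = read[:split], read[split:]
        left_hits = find_all(reference, left)
        if len(left_hits) != 1:
            continue  # left part is absent or ambiguous
        left_end = left_hits[0] + split
        window = reference[left_end:left_end + max_deletion + len(right)]
        right_hits = find_all(window, right)
        # An offset of 0 means the read is contiguous, i.e. there is no deletion.
        if len(right_hits) == 1 and right_hits[0] > 0:
            return left_end, right_hits[0]
    return None


if __name__ == "__main__":
    reference = "TTGACCATGCAGGCTAGTCCGATAAGCT"
    read = "CCATGC" + "CCGATA"  # donor read spanning an 8 bp deletion
    print(detect_deletion(reference, read, max_deletion=10))
```

Restricting the search for the right part to a window of read length plus the maximum deletion size mirrors the bounded search space in step 3 and keeps chance matches rare.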
29 The Pindel algorithm (Insertions) (1) Use the 3' end of the left read as anchor point (2) Use pattern growth to search for minimum and maximum unique substrings from the 3' end of the unmapped read (<=2x insert size) (3) Use pattern growth to search for minimum and maximum unique substrings from the 5' end of the unmapped read (read length - 1), starting from the mapped end in step 2 (4) Check whether the complete unmapped read can be combined from the 3' and 5' end substring matches Ye et al. (2009)
30 The Pindel algorithm (Insertions) (1) Use the 3' end of the left read as anchor point (2) Use pattern growth to search for minimum and maximum unique substrings from the 3' end of the unmapped read (<=2x insert size) (3) Use pattern growth to search for minimum and maximum unique substrings from the 5' end of the unmapped read (read length - 1), starting from the mapped end in step 2 (4) Check whether the complete unmapped read can be combined from the 3' and 5' end substring matches. In the initial Pindel version, exact matches to the reference were required Ye et al. (2009)
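For insertions the two flanks of the read map to adjacent reference positions, and the bases between them in the read are the inserted sequence. The toy sketch below makes the same simplifications as before (exact matching, no strands, no mate anchor) and is not Pindel's implementation. `detect_insertion`, `min_flank`, and the example sequences are hypothetical. The minimum flank length is an added assumption that keeps short chance matches from being reported.

```python
def find_all(text, sub):
    """Return the start positions of all (possibly overlapping) exact matches."""
    hits, start = [], text.find(sub)
    while start != -1:
        hits.append(start)
        start = text.find(sub, start + 1)
    return hits


def detect_insertion(reference, read, min_flank=4):
    """Find flanks of read that are adjacent in reference around extra bases.

    Returns (breakpoint, inserted_sequence) or None. The left flank must
    match the reference uniquely; the right flank must start exactly where
    the left flank ends, leaving at least one unexplained base in between.
    """
    n = len(read)
    # Try the longest left flank first so the inserted part is as short as possible.
    for left_len in range(n - min_flank - 1, min_flank - 1, -1):
        left_hits = find_all(reference, read[:left_len])
        if len(left_hits) != 1:
            continue
        breakpoint = left_hits[0] + left_len
        for right_len in range(n - left_len - 1, min_flank - 1, -1):
            right = read[n - right_len:]
            if reference.startswith(right, breakpoint):
                return breakpoint, read[left_len:n - right_len]
    return None


if __name__ == "__main__":
    reference = "TTGACCATGCAGGCTAGTCCGATAAGCT"
    read = "TGCAGG" + "TTA" + "CTAGTC"  # donor read carrying a 3 bp insertion
    print(detect_insertion(reference, read))
```

Because both flanks and the inserted bases must fit inside one read, this style of search can only recover insertions shorter than the read, consistent with the "read length - 1" bound in step 3.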
31 The Pindel algorithm (Real Data) Ye et al. (2009)
32 The Pindel algorithm (Real Data) Ye et al. (2009)
33 The Pindel algorithm for complex variants a) large deletion b) tandem duplication c) inversion d-f) same as a-c with non-template sequence (yellow part) Ye et al. Pindel manual
34 Acknowledgements: Tobias Rausch (EMBL), Kai Ye (Leiden University Medical Center), Anne-Katrin Emde (Freie Universität Berlin). References: Kai Ye, Marcel H. Schulz, Quan Long, Rolf Apweiler, and Zemin Ning. Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics (2009) 25(21): Pindel homepage: SplazerS homepage: