Evolution by duplication

1 6.095/ Computational Biology: Genomes, Networks, Evolution Lecture 18 Nov 10, 2005 Evolution by duplication Somewhere, something went wrong

2 Challenges in Computational Biology 4 Genome Assembly Regulatory motif discovery Gene Finding DNA Sequence alignment 8 Comparative Genomics TCATGCTAT TCGTGATAA TGAGGATAT 7 Evolutionary Theory TTATCATAT TTATGATTT Database lookup RNA folding 9 Gene expression analysis 12 Protein network analysis RNA transcript 10 Cluster discovery Gibbs sampling 13 Regulatory network inference 14 Emerging network properties

3 Open questions (?) Image removed due to copyright restrictions. Image removed due to copyright restrictions. Image removed due to copyright restrictions. Panda Bear or raccoon? Out of Africa mitochondrial evolution story? Human evolution Did we ever meet Neanderthal? Primate evolution Are we chimp-like or gorilla-like? Vertebrate evolution How did complex body plans arise? Recent evolution What genes are under selection?

4 What we have learned Phylogenetic trees Distance-based methods UPGMA, Neighbor-Joining Alignment-based methods Parsimony: set-based, dynamic programming Evolution by nucleotide mutation Probability of back-mutation Markov chain Models of evolution Jukes-Cantor: Kimura 2-parameter model Evolution by rearrangements Sorting by reversals Signed / unsigned version & approximation algorithms

5 Today's goals: Evolution by Duplication Detecting gene duplication Orthologs and paralogs Gene trees and species trees Reconciliation Detecting genome duplication Evidence across species Evidence in a single species Duplicate gene evolution Detect accelerated divergence Measuring positive selection Gene conversion

6 Determining orthologs and paralogs

7 Orthologs and paralogs human mouse rat dog rabbit orthologs paralogs Orthologs arise by speciation typically keep same function Paralogs arise by duplication typically take on new functions Ortholog identification a prerequisite to genomic studies

8 Why are orthologs & paralogs important? Comparative genomics relies on correct orthology Signal discovery by orthologous conservation Evolutionary genomics relies on complete mapping Duplicated regions are also the most interesting ones Image removed due to copyright restrictions. Please see: Kellis, Manolis, Bruce W. Birren, and Eric S. Lander. "Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae." Nature 428 (April 2004): Whole-genome duplication in yeast, fish, and vertebrates

9 Challenges in genome-wide orthology Tens of thousands of genes Abundant duplication and loss Spurious matches Noisy data Many paralogous families precede species divergence Single phylogeny is impossible not enough traits Protein family expansions Gene conversion, loss, inactivation Common domains in unrelated proteins Similarity not always due to common ancestry Varying rates of mutation (gene & species) Pseudogenes, incorrect/incomplete gene models Goal: Systematic ortholog identification across multiple, complete, mammalian genomes

10 Current methods for ortholog finding Pair-wise sequence comparison Hit clustering methods Synteny methods Phylogenetic methods Best bi-directional BLAST hits Focuses on one-to-one orthologs (no duplications) Detect clusters in graph of pair-wise hits Difficulty to separate large connected components Detect conserved regions, stretches of nearby hits Genome alignment methods focus on best hits Phylogeny of family clusters orthologs near each other Traditionally applied to specific families (not genome-wide) Current methods successful in limited datasets Complete mammalian genomes present new challenges

11 Algorithm: SynPhyl Images removed due to copyright restrictions. Combine synteny and phylogeny to find orthologs Initial gene family construction Build phylogenetic trees within families Reconcile gene trees to determine orthology

12 Building Meaningful Gene Families

13 Step 1. Initial gene family construction Challenge: How to keep cluster sizes balanced Limitations of traditional clustering methods UPGMA, k-means, graph-partitioning lead to imbalance Bi-partitioning methods lead to arbitrary midway splitting SynPhyl approach: a. Seed clusters with unambiguous hits b. Extend clusters in gene pulling step c. Refine clusters in phylogeny step Balanced Clusters

14 Step 1. Initial gene family construction (1) Initial cluster seeds from unambiguous matches Syntenic orthologs Multi-species significant BBH Human BBH component Dog human Mouse dog mouse Rat Initial gene clusters
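The best bi-directional hit (BBH) criterion used to seed clusters can be sketched as follows. This is a minimal illustration, not the SynPhyl implementation: the gene names, the score dictionaries, and the function names `best_hit` and `bidirectional_best_hits` are hypothetical, and the scores stand in for whatever pairwise similarity measure (for example BLAST bit scores) is actually used.

```python
def best_hit(scores, query):
    """Return the highest-scoring subject for `query`, or None if it has no hits."""
    hits = {subject: s for (q, subject), s in scores.items() if q == query}
    return max(hits, key=hits.get) if hits else None


def bidirectional_best_hits(a_to_b, b_to_a):
    """Pairs (gene_a, gene_b) where each gene is the other's best hit."""
    pairs = []
    for gene_a in sorted({q for q, _ in a_to_b}):
        gene_b = best_hit(a_to_b, gene_a)
        if gene_b is not None and best_hit(b_to_a, gene_b) == gene_a:
            pairs.append((gene_a, gene_b))
    return pairs


# Hypothetical similarity scores between human (h*) and mouse (m*) genes.
human_to_mouse = {("h1", "m1"): 950, ("h1", "m2"): 400,
                  ("h2", "m2"): 700, ("h3", "m2"): 650}
mouse_to_human = {("m1", "h1"): 940, ("m2", "h2"): 710, ("m2", "h3"): 640}

seeds = bidirectional_best_hits(human_to_mouse, mouse_to_human)
print(seeds)  # h3 is left unassigned: its best hit m2 prefers h2
```

Genes such as `h3` that have no reciprocal partner are exactly the ones left for the later cluster-extension step.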

15 Step 2. Cluster extension (1) Initial cluster seeds from unambiguous matches (2) Cluster extension Pull unassigned genes to existing clusters Ensure distance of new gene within cluster distribution Unassigned genes Initial gene clusters
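A possible reading of the cluster-extension step is sketched below. The slide only says that the new gene's distance must fall within the cluster's distance distribution; the specific rule used here (mean distance to the nearest cluster no larger than the within-cluster mean plus `z` standard deviations) is an assumption made for illustration, as are the function names and the toy one-dimensional "distances".

```python
from itertools import combinations
from statistics import mean, pstdev


def extend_clusters(clusters, unassigned, dist, z=2.0):
    """Pull each unassigned gene into its nearest cluster if it fits that
    cluster's within-cluster distance distribution (assumed rule, see text)."""
    extended = {name: list(members) for name, members in clusters.items()}
    leftover = []
    for gene in unassigned:
        # Nearest cluster by mean distance to its seed members.
        mean_d = {name: mean([dist(gene, m) for m in members])
                  for name, members in clusters.items()}
        nearest = min(mean_d, key=mean_d.get)
        # Reference distribution: pairwise distances among seed members.
        ref = [dist(a, b) for a, b in combinations(clusters[nearest], 2)]
        limit = mean(ref) + z * pstdev(ref)
        if mean_d[nearest] <= limit:
            extended[nearest].append(gene)
        else:
            leftover.append(gene)
    return extended, leftover


# Toy data: genes placed on a line, distance = absolute difference.
coords = {"h1": 0.0, "m1": 0.1, "d1": 0.2,
          "h2": 5.0, "m2": 5.1, "d2": 5.3,
          "r1": 0.15, "x": 2.5}
dist = lambda a, b: abs(coords[a] - coords[b])
clusters = {"A": ["h1", "m1", "d1"], "B": ["h2", "m2", "d2"]}

extended, leftover = extend_clusters(clusters, ["r1", "x"], dist)
print(extended, leftover)
```

Each seed cluster needs at least two members so that a within-cluster distribution exists; genes that fit no cluster stay unassigned rather than being forced into one.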

16 Step 3. Phylogenetic reconstruction (1) Initial cluster seeds from unambiguous matches (2) Cluster extension (3) Phylogenetic reconstruction Phylogeny for each cluster Align each cluster (MUSCLE protein alignment) Neighbor-Joining: fast, distance-based (JTT model) Bootstrapping used for confidence measure, propagates Use phylogeny to further separate clusters Reconciliation Four mammals - 78,744 genes - 17,586 trees - Largest: 103 genes Ten fungi - 54,890 genes - 5,537 trees - Largest: 164 genes 80% 60% 90% 90% Extended gene clusters

17 Bootstrap confidence scores Repeat 100 times Gene cluster Alignment Sample with replacement Bootstrapping: Sample columns from the alignment randomly Build trees based on these columns (NJ, ML, MP) For every internal branch Count how many topologies agree with inferred split Percentage is the bootstrap confidence score Building a final tree Full tree, using all the data Consensus tree Tree
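The bootstrap procedure can be illustrated on the smallest interesting case, a four-sequence alignment with a single internal branch. In this sketch the tree builder is a toy stand-in for NJ, ML, or MP: it picks the quartet split whose two pairs have the smallest summed Hamming distance. The sequences, function names, and default parameters are hypothetical.

```python
import random


def hamming(a, b):
    """Number of mismatching columns between two aligned sequences."""
    return sum(x != y for x, y in zip(a, b))


def quartet_split(seqs):
    """Choose the split (ab|cd, ac|bd, or ad|bc) with the smallest summed
    within-pair distance; ties go to the first candidate."""
    a, b, c, d = sorted(seqs)
    candidates = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    return min(candidates,
               key=lambda split: sum(hamming(seqs[x], seqs[y]) for x, y in split))


def bootstrap_support(alignment, replicates=100, seed=0):
    """Return the split inferred from all columns and the percentage of
    column-resampled replicates that recover the same split."""
    rng = random.Random(seed)
    length = len(next(iter(alignment.values())))
    full = quartet_split(alignment)
    agree = 0
    for _ in range(replicates):
        # Sample alignment columns with replacement.
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {name: "".join(seq[i] for i in cols)
                     for name, seq in alignment.items()}
        agree += quartet_split(resampled) == full
    return full, 100.0 * agree / replicates


# Hypothetical alignment with a few conflicting columns.
alignment = {"dog": "ACGTACGTAC", "human": "ACGTACGAAC",
             "mouse": "ACTTGCGAAT", "rat": "ACTTGCGTAT"}
split, support = bootstrap_support(alignment)
print(split, f"{support:.0f}%")
```

For larger trees the same loop applies, with one support value per internal branch of the full-data tree.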

18 Phylogenetic Tree Reconciliation Gene Tree ↔ Species Tree

19 Gene Tree / Species Tree reconciliation Known species tree G1: Each species contains each subfamily Easy to infer duplication events G2: Loss events in each family hide complex ancestry Reconciliation with species tree recovers the events

20 Reconciliation to determine orthology Reconcile each gene tree to the species tree Each node in gene tree maps to node in species tree Read off orthology and paralogy Infer gene duplication and loss events [Figure: gene tree with leaves d1, h1, m1, r1, m2, r2, annotated with a gene loss in chimp and a gene duplication in the rodent ancestor; species tree with dog, human, chimp, mouse, rat]

21 Reconciliation algorithm For every node g, decide duplication or speciation Map left child to tree → M(a). Map right child to tree → M(b) M(g) is least common ancestor of M(a) and M(b) After mapping: g is a duplication node if M(g) equals M(a) or M(b); g is a speciation node if M(g) is distinct from both M(a) and M(b) Post-processing: count loss edges Limitation: Reconciliation assumes correct species tree Generally NOT the case
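The mapping rule can be sketched in a few lines. The species tree topology below (dog as outgroup, human with chimp, mouse with rat) and the internal node labels are assumptions made for illustration; the gene tree follows the d1, h1, m1, r1, m2, r2 example from the previous slide. Loss counting, which the slide treats as post-processing, is left out.

```python
# Assumed species tree, stored as child -> parent (internal labels are hypothetical).
SPECIES_PARENT = {
    "root": None,
    "dog": "root", "human_rodent_ancestor": "root",
    "primate_ancestor": "human_rodent_ancestor",
    "rodent_ancestor": "human_rodent_ancestor",
    "human": "primate_ancestor", "chimp": "primate_ancestor",
    "mouse": "rodent_ancestor", "rat": "rodent_ancestor",
}


def ancestors(node):
    """Path from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = SPECIES_PARENT[node]
    return path


def lca(x, y):
    """Least common ancestor of two species-tree nodes."""
    seen = set(ancestors(x))
    return next(n for n in ancestors(y) if n in seen)


def reconcile(gene_tree, gene_species, events):
    """Map a gene-tree node to the species tree; record one event per internal node."""
    if isinstance(gene_tree, str):              # leaf: map the gene to its species
        return gene_species[gene_tree]
    left, right = gene_tree
    m_a = reconcile(left, gene_species, events)
    m_b = reconcile(right, gene_species, events)
    m_g = lca(m_a, m_b)
    kind = "duplication" if m_g in (m_a, m_b) else "speciation"
    events.append((m_g, kind))
    return m_g


gene_species = {"d1": "dog", "h1": "human", "m1": "mouse",
                "r1": "rat", "m2": "mouse", "r2": "rat"}
gene_tree = ("d1", ("h1", (("m1", "r1"), ("m2", "r2"))))

events = []
reconcile(gene_tree, gene_species, events)
for node, kind in events:
    print(node, kind)
```

The node joining (m1, r1) with (m2, r2) maps to the same species node as both of its children, so it is called a duplication in the rodent ancestor; every other internal node maps above its children and is a speciation.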

22 Mammalian tree: Abundance of alternate tree topologies Most trees are incorrect Count most frequent subtrees of size four Correct species tree a minority <20% Reason: Long branch attraction Due to rapidly evolving rodent lineage Common phylogenetic reconstruction problem What happens to reconciliation?

23 Reconciliation with erroneous trees Gene tree Species tree duplication D H M R D H M R D H M R With erroneous trees: Direct reconciliation leads to spurious duplications & losses Solution: Use species tree to constrain gene tree

24 Towards better reconciliation methods [Figure: gene tree with leaves d1, h1, m1, r1, d2, h2, m2, r2, m3, r3, a new root, and two candidate topologies (Topology 1, Topology 2); species tree with dog, human, mouse, rat] Full solution: Maximize joint likelihood Incorporate cost of reconciliation in tree building Tradeoff: nucleotide mutations & gene duplication/loss One solution: Partitioning by Reconciliation Key insight: most errors are on older branches, irrelevant to orthology Use species tree to partition gene tree Allow re-rooting of each partition based on species tree → Apply reconciliation algorithm to each partition

25 Step 4: Partitioning by reconciliation (1) Initial cluster seeds (2) Cluster extension (3) Phylogenetic reconstruction Gene Clusters (4) Partitioning by reconciliation Partitioned Trees Partition Unrooted Trees Unrooted Trees Phylogeny Repeat 100 times Rooted Trees Select root Reconciliation Bootstrapping Loop Ortholog assignments with confidence score

26 Putting it all together: SynPhyl Gene Annotations Gene Family Clusters Initial clustering Genome synteny Repeat 100 times Unrooted Trees Partitioned Trees Unrooted Trees Partition Phylogeny Rooted Trees Reconciliation Select root Bootstrapping Loop Ortholog and Paralog Database Assign orthology with confidence scores

27 Benchmarks and Results

28 Results: Mammalian comparisons Compare human, mouse, rat, dog complete genomes Coverage: 75,753 genes Number of groups: 18,446 (of which 13,741 have all four species) One-to-one orthologs in four species: 12,359 Species Present # Groups Dog Human Mouse Rat Count of ortholog groups by species - Human Mouse Rat 752 Dog - Mouse Rat 457 Dog Human - Rat 270 Dog Human Mouse Mouse Rat 502 Dog Human Dog - Mouse Dog - - Rat 97 - Human Mouse Human - Rat 41 Contribution of phylogenetic reconstruction More one-to-one orthologs: 11,619 → 12,359 Large families split into small groups: 17,586 → 18,446 Figure by MIT OCW.

29 Higher resolution: resolving fine-grain correspondence

30 Higher sensitivity: recognize subtle duplication events [Figure by MIT OCW: species compositions (dog, human, mouse, rat, other) and counts] Additional duplicates found for ENSEMBL 1-to-1 orthologs Hundreds of additional duplicates detected Confirmed by branch lengths and topology

31 SynPhyl comparison to direct reconciliation Fewer gene losses: total count 18,352 (direct reconciliation) vs. 11,750 (SynPhyl reconciliation) Fewer gene duplications: total count 10,114 (direct reconciliation) vs. 8,942 (SynPhyl reconciliation) More gene trees reconcile to species tree Gene duplications and losses dramatically decreased

32 Result: Genome-wide correspondence of multiple species Image removed due to copyright restrictions.

33 Summary / Contributions SynPhyl: new tool for genome-wide orthology Uses synteny, phylogeny, and known species tree Automatically determines orthologs and paralogs Returns ortholog assignments, trees for each family Algorithmic highlights Initial clustering constrained by synteny Fine-grain correspondence uses phylogeny Partition by reconciliation constrained by species trees Advantages of the algorithm Practical, fast (< ½ day on a PC) Uses information available: phylogeny, synteny Confidence metric: bootstrap values propagate to orthology Phylogeny ensures consistent orthologs (no over-collapsing) Performance Successfully applied to mammals, fungi Fine-grain resolution: phylogeny disambiguates large families High sensitivity: captures all duplication events

34 Outline Detecting gene duplication Orthologs and paralogs Gene trees and species trees Reconciliation Detecting genome duplication Evidence across species Evidence in a single species Duplicate gene evolution Detect accelerated divergence Measuring positive selection Gene conversion

35 Genome Duplication

36 A range of evolutionary distances 20 Myr 5 Myr S.cerevisiae S.paradoxus S.mikatae S.bayanus 100 Myr K. waltii Ability to ask different set of questions

37 Gene correspondence Image removed due to copyright restrictions. Please see: Kellis, Manolis, Bruce W. Birren, and Eric S. Lander. "Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae." Nature 428 (April 8, 2004):

38 Gene correspondence Image removed due to copyright restrictions. Please see: Kellis, Manolis, Bruce W. Birren, and Eric S. Lander. "Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae." Nature 428 (April 8, 2004):

39 Signatures of evolutionary events Image removed due to copyright restrictions. Please see: Kellis, Manolis, Bruce W. Birren, and Eric S. Lander. "Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae." Nature 428 (April 8, 2004): Few genes remain in 2 copies Gene interleaving is evidence of complete duplication

40 Duplicate mapping tiles K. waltii Image removed due to copyright restrictions. Please see: Figure 3 in Kellis, Manolis, Bruce W. Birren, and Eric S. Lander. "Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae." Nature 428 (April 8, 2004):

41 Duplicate mapping of centromeres Image removed due to copyright restrictions. Please see: Figure 2 in Kellis, Manolis, Bruce W. Birren, and Eric S. Lander. "Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae." Nature 428 (April 8, 2004): Recognize sister regions solely based on gene order

42 Conclusion: Whole Genome Duplication has happened Image removed due to copyright restrictions. Please see: Figure 1 in Kellis, Manolis, Bruce W. Birren, and Eric S. Lander. "Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae." Nature 428 (April 8, 2004):

43 Whole Genome Duplications are everywhere! Image removed due to copyright restrictions. Yeast Duplication - Most genes 1-to-1 mapping - Gene interleaving evidence of duplication - Complete tiling of the genome Image removed due to copyright restrictions. Vertebrate Duplication in Fish - Fish: Gene order not conserved, only chromosomes - Mammals: Gene order conserved, not chromosomes Image removed due to copyright restrictions. Two rounds of WGD in base of vertebrate lineage - Build clusters of related genes (use Ciona as outgroup) - Count duplications by reconciliation - Find regions of duplicate overlap → 4-way synteny

44 Genome duplication evidence in a single species

45 Evidence of duplication using a single genome? Image removed due to copyright restrictions. Please see: Figure 1 in Kellis, Manolis, Bruce W. Birren, and Eric S. Lander. "Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae." Nature 428 (April 8, 2004): Genomic evidence However Conserved order of paralogous genes Same transcriptional orientation Interspersed with single-copy genes Interpretation: Genome duplication followed by gene loss

46 Whole genome duplication is controversial Insufficient evidence Only 50% of genome in duplicate regions Only 8% of genes present in two copies Extensive redundancy outside duplicate regions Evidence against WGD Divergence-based dating show multiple times Other species have similar level of redundancy Alternative evolutionary scenario proposed Independent segmental duplications Also consistent with the evidence Image removed due to copyright restrictions. Please see: Figure 1 in Kellis, Manolis, Bruce W. Birren, and Eric S. Lander. "Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae." Nature 428 (April 8, 2004): There was a whole-genome duplication. Wolfe, Nature 97 There was no whole-genome duplication. Dujon, FEBS 2000 At least some chrom dup. occurred independently Langkjaer, JMB, 2000 Dynamic equilibrium of duplications and loss Llorente, FEBS, 2000 Recent evidence supports single event. Wong, PNAS 02 Continuous block duplications and deletions Dujon, Yeast 2003 Dup. precedes divergence from Kluyveromyces. Piskur, Nature, 2003 Telomere-mediated duplication events Coissac, Mol Bio Evo 1997 Multiple closely spaced events Friedman, Genome Res, 2003 Spontaneous duplication of large chromosomal segments Koszul, EMBO 04 Evidence remains inconclusive

47 Conclusion: Whole Genome Duplication has happened Image removed due to copyright restrictions. Please see: Figure 1 in Kellis, Manolis, Bruce W. Birren, and Eric S. Lander. "Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae." Nature 428 (April 8, 2004):

48 Outline Detecting gene duplication Orthologs and paralogs Gene trees and species trees Reconciliation Detecting genome duplication Evidence across species Evidence in a single species Duplicate gene evolution Detect accelerated divergence Measuring positive selection Gene conversion

49 Post-duplication evolution

50 Whole-genome duplication results in 500 new genes [Figure: number of genes over time: 5,000 before WGD, 10,000 after WGD, 5,500 today after gene loss (~500 gained over 100 Myr)] Evidence of accelerated gene evolution

51 Fate of duplicated genes 457 genes kept in two copies, result of selection Involved in sugar metabolism and fermentation WGD S. cerevisiae copy 1 S. cerevisiae copy 2 K. waltii Evidence of accelerated protein divergence?

52 Measuring accelerated divergence Protein divergence: Count amino-acid changes; Use BLOSUM substitution matrix. Nucleotide divergence: Count nucleotide substitutions; Correct for back-mutations; Use transition/transversion evolutionary model. dN/dS: Two types of nucleotide substitutions. S = synonymous: Preserve amino-acid translation. N = non-synonymous: Change amino-acid. Count synonymous / non-synonymous sites. Depends on path taken between two codons. [Figure: two shortest paths possible between codons GTT (V: Val) and TTA (L: Leu), one via TTT (F: Phe), one via GTA (V: Val)]
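The path dependence in the GTT to TTA example can be made concrete with a short sketch. It enumerates every shortest path between two codons, classifies each single-nucleotide step as synonymous or non-synonymous using the standard genetic code, and averages over paths with equal weight. Equal weighting is an assumption (it is one simple convention, not necessarily the one used in the lecture), paths through stop codons are not excluded, and the function names are hypothetical.

```python
from itertools import permutations

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def shortest_paths(codon1, codon2):
    """Yield every shortest mutational path, changing one differing position per step."""
    differing = [i for i in range(3) if codon1[i] != codon2[i]]
    for order in permutations(differing):
        path = [codon1]
        for pos in order:
            prev = path[-1]
            path.append(prev[:pos] + codon2[pos] + prev[pos + 1:])
        yield path


def count_changes(path):
    """Return (synonymous, non-synonymous) step counts along one path."""
    syn = nonsyn = 0
    for a, b in zip(path, path[1:]):
        if CODON_TABLE[a] == CODON_TABLE[b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


def average_changes(codon1, codon2):
    """Average the counts over all shortest paths, weighting paths equally."""
    counts = [count_changes(p) for p in shortest_paths(codon1, codon2)]
    n = len(counts)
    return sum(s for s, _ in counts) / n, sum(x for _, x in counts) / n


for path in shortest_paths("GTT", "TTA"):
    print(" -> ".join(f"{c}({CODON_TABLE[c]})" for c in path), count_changes(path))
print(average_changes("GTT", "TTA"))
```

The path through TTT involves two amino-acid changes (Val to Phe to Leu), while the path through GTA involves one silent step and one amino-acid change, which is why the synonymous and non-synonymous counts depend on the path chosen.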

53 Scenarios for rapid gene evolution One copy faster Scer - copy2 Scer - copy1 Kwal Ohno, 1970 Both copies faster Scer - copy1 Kwal Scer - copy2 Lynch, % of duplicated genes show acceleration 95% of cases: Only one copy faster
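One simple way to tell these scenarios apart, assuming pairwise distances among the two S. cerevisiae copies and the single K. waltii outgroup gene are available, is to split the three distances into three branch lengths and compare the two copy branches. The distances below are hypothetical, and the function names are illustrative rather than taken from the lecture.

```python
def branch_lengths(d12, d1o, d2o):
    """Branch lengths of the three-leaf tree (copy1, copy2, outgroup)
    from pairwise distances: copy1-copy2, copy1-outgroup, copy2-outgroup."""
    b1 = (d12 + d1o - d2o) / 2   # branch leading to copy 1
    b2 = (d12 + d2o - d1o) / 2   # branch leading to copy 2
    bo = (d1o + d2o - d12) / 2   # branch leading to the outgroup
    return b1, b2, bo


def faster_copy(d12, d1o, d2o):
    """Return which copy has the longer branch and the ratio of the two branches."""
    b1, b2, _ = branch_lengths(d12, d1o, d2o)
    slow, fast = sorted([b1, b2])
    name = "copy1" if b1 > b2 else "copy2"
    return name, fast / slow


# Hypothetical protein distances: Scer copy1 vs copy2, and each copy vs Kwal.
print(branch_lengths(0.75, 0.5, 1.0))
print(faster_copy(0.75, 0.5, 1.0))
```

A ratio near 1 would point to both copies evolving at similar rates, while a large ratio matches the asymmetric case in which only one copy has accelerated.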

54 Emerging gene functions after duplication Origin of replication → silencing 4-fold acceleration Scer Scer - Orc1 (origin of replication) Kwal -Orc1 - Sir3 (silencing) Translation initiation → anti-viral defense 3-fold acceleration Scer - Hbs1 (translation initiation) Kwal - Hbs1 Scer - Ski7 (anti-viral defense) Asymmetric divergence → recognize ancestral / derived

55 Distinct functional properties Gene deletion: ancestral function Lethal (20%), derived function Never lethal Gain new function and lose ancestral function

56 Distinct functional properties Gene deletion: ancestral function Lethal (20%), derived function Never lethal. Expression: ancestral function Abundant, derived function Specific (stress, starvation). Localization: ancestral function General, derived function Specific (mitochondrion, spores). Gain new function and lose ancestral function

57 Gene conversion

58 Decelerated evolution Scer copy1 Scer copy2 Kwal 60 gene pairs (13% of 457 pairs) 98% protein identity (all pairs: 55%) 90% identity in 4fold degenerate sites (all pairs: 41%) Not recent duplication Gene order argues ancestral WGD pairs Gene conversion?

59 Evidence of gene conversion WGD YBL072C S. cerevisiae YER102W S. cerevisiae YBL072C S. bayanus YER102W S. bayanus K. waltii A. gossypii Tree root reveals time of duplication No acceleration in the K. waltii branch The two genes have recently replaced each other Branching order reveals gene conversion Paralogs are closer to each other than to their ortholog Both S. cerevisiae and S. bayanus show gene conversion Periodic gene conversion
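The branching-order signature (paralogs closer to each other than to their orthologs) can be checked directly from pairwise distances. The sketch below flags each species in which the two copies are closer to each other than either is to its ortholog in the other species. The distance values are hypothetical, and the function names are illustrative.

```python
def pair_distance(dist, x, y):
    """Look up a symmetric distance between two (gene, species) labels."""
    return dist[frozenset((x, y))]


def conversion_species(dist, copy1, copy2, species):
    """Species in which the paralogs are closer to each other than to their orthologs."""
    flagged = []
    for sp in species:
        others = [o for o in species if o != sp]
        paralog_d = pair_distance(dist, (copy1, sp), (copy2, sp))
        ortholog_d = min(pair_distance(dist, (c, sp), (c, o))
                         for c in (copy1, copy2) for o in others)
        if paralog_d < ortholog_d:
            flagged.append(sp)
    return flagged


# Hypothetical distances for the YBL072C / YER102W pair in two species.
dist = {
    frozenset({("YBL072C", "Scer"), ("YER102W", "Scer")}): 0.02,
    frozenset({("YBL072C", "Sbay"), ("YER102W", "Sbay")}): 0.03,
    frozenset({("YBL072C", "Scer"), ("YBL072C", "Sbay")}): 0.10,
    frozenset({("YER102W", "Scer"), ("YER102W", "Sbay")}): 0.12,
}

flagged = conversion_species(dist, "YBL072C", "YER102W", ["Scer", "Sbay"])
print(flagged)
```

Without gene conversion, a duplication that predates the species split would leave each copy closer to its ortholog than to its paralog, so no species would be flagged.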

60 Summary Detecting gene duplication Orthologs and paralogs Gene trees and species trees Reconciliation Detecting genome duplication Evidence across species Evidence in a single species Duplicate gene evolution Detect accelerated divergence Measuring positive selection Gene conversion


Gene Families part 2. Review: Gene Families /727 Lecture 8. Protein family. (Multi)gene family

Gene Families part 2. Review: Gene Families /727 Lecture 8. Protein family. (Multi)gene family Review: Gene Families Gene Families part 2 03 327/727 Lecture 8 What is a Case study: ian globin genes Gene trees and how they differ from species trees Homology, orthology, and paralogy Last tuesday 1

More information

CHAPTERS 24-25: Evidence for Evolution and Phylogeny

CHAPTERS 24-25: Evidence for Evolution and Phylogeny CHAPTERS 24-25: Evidence for Evolution and Phylogeny 1. For each of the following, indicate how it is used as evidence of evolution by natural selection or shown as an evolutionary trend: a. Paleontology

More information

What is Phylogenetics

What is Phylogenetics What is Phylogenetics Phylogenetics is the area of research concerned with finding the genetic connections and relationships between species. The basic idea is to compare specific characters (features)

More information

Graph Alignment and Biological Networks

Graph Alignment and Biological Networks Graph Alignment and Biological Networks Johannes Berg http://www.uni-koeln.de/ berg Institute for Theoretical Physics University of Cologne Germany p.1/12 Networks in molecular biology New large-scale

More information

17 Non-collinear alignment Motivation A B C A B C A B C A B C D A C. This exposition is based on:

17 Non-collinear alignment Motivation A B C A B C A B C A B C D A C. This exposition is based on: 17 Non-collinear alignment This exposition is based on: 1. Darling, A.E., Mau, B., Perna, N.T. (2010) progressivemauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One 5(6):e11147.

More information

Processes of Evolution

Processes of Evolution 15 Processes of Evolution Forces of Evolution Concept 15.4 Selection Can Be Stabilizing, Directional, or Disruptive Natural selection can act on quantitative traits in three ways: Stabilizing selection

More information

Comparing Genomes! Homologies and Families! Sequence Alignments!

Comparing Genomes! Homologies and Families! Sequence Alignments! Comparing Genomes! Homologies and Families! Sequence Alignments! Allows us to achieve a greater understanding of vertebrate evolution! Tells us what is common and what is unique between different species

More information

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz

Phylogenetic Trees. What They Are Why We Do It & How To Do It. Presented by Amy Harris Dr Brad Morantz Phylogenetic Trees What They Are Why We Do It & How To Do It Presented by Amy Harris Dr Brad Morantz Overview What is a phylogenetic tree Why do we do it How do we do it Methods and programs Parallels

More information

Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution

Massachusetts Institute of Technology Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution Massachusetts Institute of Technology 6.877 Computational Evolutionary Biology, Fall, 2005 Notes for November 7: Molecular evolution 1. Rates of amino acid replacement The initial motivation for the neutral

More information

Lecture 11 Friday, October 21, 2011

Lecture 11 Friday, October 21, 2011 Lecture 11 Friday, October 21, 2011 Phylogenetic tree (phylogeny) Darwin and classification: In the Origin, Darwin said that descent from a common ancestral species could explain why the Linnaean system

More information

Reading for Lecture 13 Release v10

Reading for Lecture 13 Release v10 Reading for Lecture 13 Release v10 Christopher Lee November 15, 2011 Contents 1 Evolutionary Trees i 1.1 Evolution as a Markov Process...................................... ii 1.2 Rooted vs. Unrooted Trees........................................

More information

Phylogenetics in the Age of Genomics: Prospects and Challenges

Phylogenetics in the Age of Genomics: Prospects and Challenges Phylogenetics in the Age of Genomics: Prospects and Challenges Antonis Rokas Department of Biological Sciences, Vanderbilt University http://as.vanderbilt.edu/rokaslab http://pubmed2wordle.appspot.com/

More information
