Comparative Genomics. Dept. of Computer Science Comenius University in Bratislava, Slovakia

1 Comparative Genomics
Broňa Brejová
Dept. of Computer Science, Comenius University in Bratislava, Slovakia

3 Why sequence so many genomes?

4 Comparative genomics
- Compare genomic sequences of multiple related species
  - find similarities and differences: substitutions, indels, genome rearrangements and duplications
- Explore evolutionary processes
  - neutral mutations vs. positive / negative selection
- Find functional regions (genes, regulatory regions etc.)
  - often characterized by negative (purifying) selection
- Look for differences explaining different phenotypes, e.g.
  - human versus other primates
  - domesticated animals/plants vs. wild counterparts
  - pathogenic species vs. free-living relatives
  - adaptations to different environments and diets

5 Gene family evolution
[Figure: an evolutionary history of a gene family with duplication, speciation and loss events; the species tree over species 1, species 2 and species 3; and the resulting gene tree with leaves A1, A2, A3, B1, B2, B3]
- Homolog: shared evolutionary origin
- Ortholog: closest common ancestor is a speciation node (e.g. A1/A3)
- Paralog: closest common ancestor is a duplication node (e.g. B1/B2, A1/B1, B1/B3, B2/B3)

6 Gene tree / species tree reconciliation
- Given a species tree and a gene tree, infer the history
- Favor histories with fewer events (parsimony)
- The gene tree is inferred from gene sequences and may contain errors
[Figure: gene tree with leaves A1, A2, A3, B1, B2, B3; species tree over species 1, species 2 and species 3; and the inferred history]
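
The reconciliation idea can be sketched with the common LCA-mapping approach: each gene-tree node is mapped to the lowest species-tree node that contains all species of the genes below it, and a node is labeled a duplication when it maps to the same place as one of its children. The trees, the gene names and the helper functions below are hypothetical illustrations, not the exact trees from the slide, and the sketch labels only duplications and speciations (losses are not counted).

```python
def clades(tree):
    """Return (leaf set of the tree, leaf sets of all nodes in the tree)."""
    if isinstance(tree, str):
        leaf = frozenset([tree])
        return leaf, [leaf]
    left, left_all = clades(tree[0])
    right, right_all = clades(tree[1])
    here = left | right
    return here, left_all + right_all + [here]


def lca(species_clades, species):
    """Smallest species-tree clade that contains all the given species."""
    return min((c for c in species_clades if species <= c), key=len)


def reconcile(gene_tree, species_clades, gene_to_species, events):
    """Map a gene-tree node to a species-tree clade and record its event type."""
    if isinstance(gene_tree, str):
        return frozenset([gene_to_species[gene_tree]])
    child_maps = [reconcile(child, species_clades, gene_to_species, events)
                  for child in gene_tree]
    here = lca(species_clades, frozenset().union(*child_maps))
    # A node mapped to the same clade as one of its children is a duplication.
    kind = "duplication" if here in child_maps else "speciation"
    events.append((kind, sorted(here)))
    return here


# Hypothetical input: trees as nested tuples, gene Xi lives in species i.
species_tree = (("species1", "species2"), "species3")
gene_tree = ((("A1", "A2"), "A3"), (("B1", "B2"), "B3"))
gene_to_species = {g: "species" + g[1] for g in "A1 A2 A3 B1 B2 B3".split()}

_, species_clades = clades(species_tree)
events = []
reconcile(gene_tree, species_clades, gene_to_species, events)
duplications = [e for e in events if e[0] == "duplication"]
for kind, where in events:
    print(kind, "at ancestor of", where)
```

In this toy input the only duplication is at the root of the gene tree, which separates the A copies from the B copies before any speciation.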

7 Global view of gene family evolution
- Do not infer a history for each family; instead assume a global evolutionary model of family size change
- Find genes in genomes
- Assign them to families (by sequence similarity)
- Summarize counts for each family and each species
- Infer a species tree
- Infer the overall rate λ of gene gain/loss (for yeasts, expressed in gains and losses/gene/million years)
- Look for families with significantly accelerated evolution
[Hahn et al. 2005]

8 Stochastic model of gene family evolution
- Simplified view of how evolution might operate
- Imagine generating simulated data
- Birth and death process for gene families
[Figure: simulated gene family history on a tree of species 1, species 2 and species 3]

9 Stochastic model of gene family evolution
- Each gene can duplicate or be lost at a rate λ
- For a short time t we expect 2λtn events in an n-gene family
- For longer t some genes can be affected multiple times

$$P(X_t = c \mid X_0 = s) = \sum_{j=0}^{\min(s,c)} \binom{s}{j} \binom{s+c-j-1}{s-1} \alpha^{s+c-2j} (1-2\alpha)^{j}, \qquad \text{where } \alpha = \frac{\lambda t}{1+\lambda t}$$

- Using these formulas, we can compute the probability of a history
- We can also compute the probability (likelihood) of observing the counts in the current species
- We can find the value of λ maximizing the probability over all gene families
[Figure: example history on the species tree with family sizes at the nodes]
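
As a minimal sketch, the transition probability above can be evaluated directly from the sum. The function name `transition_prob` and the example values of λ and t are illustrative assumptions, not values taken from the slides.

```python
from math import comb


def transition_prob(s, c, lam, t):
    """P(X_t = c | X_0 = s) for a birth-death process with gain and loss rate lam."""
    if s == 0:
        # A family with no genes cannot regain any in this model.
        return 1.0 if c == 0 else 0.0
    alpha = lam * t / (1.0 + lam * t)
    total = 0.0
    for j in range(min(s, c) + 1):
        total += (comb(s, j) * comb(s + c - j - 1, s - 1)
                  * alpha ** (s + c - 2 * j) * (1.0 - 2.0 * alpha) ** j)
    return total


# Illustrative values only: rate per gene per million years, time in million years.
lam, t = 0.002, 100.0
for c in range(5):
    print(c, transition_prob(3, c, lam, t))
```

For s = 1 and c = 0 the sum reduces to α, the probability that a single gene is lost by time t. Multiplying such terms along the branches of the species tree gives the probability of a history.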

10 Results on five yeast genomes [Hahn et al. 2005]
- Rate λ of gene gain/loss is estimated in gains and losses/gene/million years
- 1254 out of 3517 gene families show some change in size
- Look for families with significantly accelerated evolution
- Example: stress response family, compared across S.cer., S.par., S.mik., S.kud., S.bay.

11 Whole-genome alignments
- For each region of a reference genome (e.g. human), find and align corresponding parts from other genomes

Human          AGTGGCTGCCAGGCTG---GGATGCTGAGGCCTTGTTTGCAGGGAGGT
Rhesus         AGTGGCTGCCAGGCTG---GGTTGCTGAGGCCTTGTTTGCCGGGAGGT
Mouse          GGTGGCTGCCGGGCTG---GGTGGCTGAGGCCTTGTTGGTGGGGTGGT
Dog            AGTGGCTGCCCGGCTG---GGTGGCTGAGGCCTTATTTGCAGGGAGGT
Horse          GATGGCTGCCGGGCTG---GGCTGCCGAGGCCTTGTTCGTGGGGAGGT
Armadillo      AGTGGCTGCCGGGCTG---GGAGGCCAAGGCCTTGTTCGCGGGCAGGT
Chicken        AGTGGCTGCCAGTCTGCGCCGTGGCCGACGTCTTGCTCGGGGGAAGGT
X. tropicalis  AATGGCTTCCATTTTGTGCCGCTGCTGAGGTCTTGTTCTGGGGAAGAT

13 Nets and chains from the UCSC genome browser
- For each region of a reference genome (e.g. human), find and align corresponding parts from other genomes (e.g. mouse, dog, chicken, etc.)
- Local alignments align exons and other conserved elements, but many parts of genomes have changed too much
- For duplicated regions, decide which pairs are orthologs
- This can be done using synteny: a chain of local alignments in the same order and orientation in the two genomes

14 Nets and chains from the UCSC genome browser
- Start with local alignments
- Connect them into chains, where we allow big indels and unaligned regions, but require the same order and orientation in both genomes
- Select some chains to form a hierarchical net: choose the chains with the highest score that do not overlap in the reference
- We can use only parts of some chains
- Parts of the non-reference genome can be used more than once (if the reference is duplicated)
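
The chaining step can be illustrated with a simple quadratic dynamic program that picks the highest-scoring set of local alignments appearing in the same order in both genomes. This is a simplified sketch, not the UCSC implementation: it assumes all blocks are on the same strand, uses half-open coordinates, and ignores the gap penalties a real chainer would apply. The block tuples and the function name `best_chain` are hypothetical.

```python
def best_chain(blocks):
    """Return (score, chain) for the best colinear chain of alignment blocks.

    Each block is (ref_start, ref_end, qry_start, qry_end, score).
    """
    blocks = sorted(blocks)            # order by start in the reference
    if not blocks:
        return 0, []
    best = [b[4] for b in blocks]      # best chain score ending at each block
    prev = [None] * len(blocks)        # predecessor of each block in that chain
    for i, (r_start, _, q_start, _, score) in enumerate(blocks):
        for j in range(i):
            _, pr_end, _, pq_end, _ = blocks[j]
            # Block j may precede block i if it ends before i starts in both genomes.
            if pr_end <= r_start and pq_end <= q_start and best[j] + score > best[i]:
                best[i] = best[j] + score
                prev[i] = j
    i = max(range(len(blocks)), key=best.__getitem__)
    total, chain = best[i], []
    while i is not None:
        chain.append(blocks[i])
        i = prev[i]
    return total, chain[::-1]


# Hypothetical local alignments between a reference and another genome.
blocks = [(0, 10, 0, 10, 5), (20, 30, 50, 60, 4),
          (15, 25, 15, 25, 3), (40, 50, 30, 40, 6)]
total, chain = best_chain(blocks)
print(total, chain)
```

Here the block at reference position 20 to 30 is left out of the best chain because its position in the other genome is inconsistent with the block that follows it.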

15 [Figure: UCSC genome browser view of a region on human chr13 showing the Mouse Chained Alignments track and the Mouse Alignment Net track with levels 1 to 6]

16 Negative (purifying) selection
- Important parts of a genome accumulate mutations more slowly
- Find conserved elements in genomes
- Many correspond to known functional elements (genes, regulation)
- Conserved elements not overlapping these are interesting targets for future research
[Figure: UCSC genome browser view with the tracks UCSC Genes, Placental Mammal Conservation by PhastCons, and Multiz Alignments of 46 Vertebrates]

Human    CAAGACGAGACAGGTAAATCTCATGAGCTTTATTCTATATTT
Chimp    CAAGACGAGACAGGTAAATCTCATGAGCTTTATTCTATATTT
Mouse    CAAGGCGGGACAGGTGAGCCTCCTGCGCTGCGCTCTCTGCTT
Dog      CAAGGCGAGACAGGTAAAGCTCATGAGATTTATTCTATATTT
Chicken  CAAGGCGAGACAGGTAATTCTTATGAGATTTCGACTGTACTT

17 Substitution models
- Jukes-Cantor model: base x mutates to some other base y at rate α (the rate is the same for all x ≠ y)
- Probability of change from A to C over time t:

$$\Pr(X_t = C \mid X_0 = A) = \frac{1}{4}\left(1 - e^{-\frac{4}{3}\alpha t}\right)$$

- Includes the possibility of multiple mutations
- We can compute the probability of a history
- We can also compute the likelihood given only the current sequences
- We can estimate the best α for a given alignment
[Figure: small trees with bases A and C at the leaves and internal nodes]
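
A small sketch of the Jukes-Cantor formula and of estimating the amount of change from a pairwise alignment. It assumes, as on the slide, that α is the total rate of leaving a base, so each specific change has rate α/3. The function names and the two short sequences are hypothetical; the estimate uses the standard closed-form inversion of the formula for two sequences, with αt measured along the path between them.

```python
import math


def jc_prob(x, y, alpha, t):
    """Pr(X_t = y | X_0 = x) under the Jukes-Cantor model."""
    decay = math.exp(-4.0 * alpha * t / 3.0)
    if x == y:
        return 0.25 + 0.75 * decay
    return 0.25 - 0.25 * decay


def jc_distance(seq1, seq2):
    """Estimate alpha * t between two aligned sequences, skipping gap columns."""
    pairs = [(a, b) for a, b in zip(seq1, seq2) if a != "-" and b != "-"]
    p = sum(a != b for a, b in pairs) / len(pairs)   # observed fraction of differences
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor estimate")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


print(jc_prob("A", "C", alpha=0.5, t=1.0))
print(jc_distance("ACGTACGTAC", "ACGTACGAAC"))
```

The estimate is larger than the raw fraction of differing columns because it accounts for multiple mutations at the same position.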

18 More complex substitution models
- The Jukes-Cantor model assumes each mutation is equally likely
- In general, there is a substitution rate $\mu_{xy}$ from base x to base y
- Substitution rate matrix:

$$\begin{pmatrix} \mu_{A} & \mu_{AC} & \mu_{AG} & \mu_{AT} \\ \mu_{CA} & \mu_{C} & \mu_{CG} & \mu_{CT} \\ \mu_{GA} & \mu_{GC} & \mu_{G} & \mu_{GT} \\ \mu_{TA} & \mu_{TC} & \mu_{TG} & \mu_{T} \end{pmatrix}$$

- $\Pr(X_t = C \mid X_0 = A)$ does not in general have a closed formula, but it can be computed by algebraic methods
- Equilibrium frequencies $\pi_A, \pi_C, \pi_G, \pi_T$ stay stable in the model

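19 HKY model [Hasegawa, Kishino and Yano 1985]
- A lower number of parameters

$$\begin{pmatrix} \mu_{A} & \beta\pi_C & \alpha\pi_G & \beta\pi_T \\ \beta\pi_A & \mu_{C} & \beta\pi_G & \alpha\pi_T \\ \alpha\pi_A & \beta\pi_C & \mu_{G} & \beta\pi_T \\ \beta\pi_A & \alpha\pi_C & \beta\pi_G & \mu_{T} \end{pmatrix}$$

- Transition rate α: C ↔ T, A ↔ G
- Transversion rate β: {C, T} ↔ {A, G}
- Five parameters: $\pi_A, \pi_C, \pi_G, \alpha, \kappa = \alpha/\beta$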
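
The following sketch builds the HKY rate matrix and computes transition probabilities numerically as the matrix exponential $e^{Qt}$. It assumes the usual convention that each diagonal entry $\mu_x$ is the negative sum of the other entries in its row. The exponential is approximated in plain Python by a truncated Taylor series with scaling and squaring, which is one simple choice among several algebraic methods. The frequencies, rates and function names are illustrative assumptions.

```python
BASES = "ACGT"
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def hky_rate_matrix(pi, alpha, beta):
    """Rate matrix with rows and columns ordered as A, C, G, T."""
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                rate = alpha if (x, y) in TRANSITIONS else beta
                q[i][j] = rate * pi[y]
        q[i][i] = -sum(q[i])   # diagonal set so that each row sums to zero
    return q


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transition_matrix(q, t, squarings=10, terms=10):
    """Approximate exp(q * t): Taylor series for a small step, then repeated squaring."""
    n = len(q)
    step = [[q[i][j] * t / 2 ** squarings for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = mat_mul(term, [[v / k for v in row] for row in step])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Illustrative parameter values, not taken from the slides.
pi = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
q = hky_rate_matrix(pi, alpha=2.0, beta=1.0)
p = transition_matrix(q, 0.5)
print("Pr(X_t = C | X_0 = A) =", p[0][1])
```

For very long times every row of the result approaches the equilibrium frequencies, which is a convenient sanity check on the matrix.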

20 Back to conserved elements
- Basic idea: infer two rates, $\alpha_c$ for slowly evolving and $\alpha_n$ for fast evolving alignment columns
- For each column, try to determine which rate is more likely
- But: one column may look conserved purely by chance (not enough information)
- Combine information from a short window, or use a phylogenetic hidden Markov model (phylo-HMM)
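
To illustrate the per-column comparison of two rates, here is a deliberately simplified sketch. It assumes a star phylogeny (all species descend directly from one ancestor over the same time t) and the Jukes-Cantor model, so the column likelihood is a sum over the four possible ancestral bases. PhastCons uses a real phylogenetic tree and a phylo-HMM; the rates, columns and function names here are hypothetical.

```python
import math


def jc_prob(x, y, rate, t):
    """Jukes-Cantor probability of observing y after time t, starting from x."""
    decay = math.exp(-4.0 * rate * t / 3.0)
    return 0.25 + 0.75 * decay if x == y else 0.25 - 0.25 * decay


def column_likelihood(column, rate, t=1.0):
    """Likelihood of one alignment column under a star phylogeny."""
    total = 0.0
    for ancestor in "ACGT":
        p = 0.25                       # uniform prior on the ancestral base
        for base in column:
            p *= jc_prob(ancestor, base, rate, t)
        total += p
    return total


def log_odds(column, slow_rate, fast_rate):
    """Positive values favor the slow (conserved) rate for this column."""
    return math.log(column_likelihood(column, slow_rate)
                    / column_likelihood(column, fast_rate))


# Hypothetical rates and columns (one base per species).
slow, fast = 0.1, 1.0
for column in ["AAAAA", "AAAAC", "ACGTA"]:
    print(column, round(log_odds(column, slow, fast), 3))
```

Summing such log-odds scores over a short window is one way to combine information from neighboring columns, so that a single column that looks conserved by chance has less influence.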

21 PhastCons: detection of conserved elements using phylo-HMMs

22 PhastCons results
- Whole-genome alignments of human, mouse, chicken and fugu

23 Conserved elements in 29 mammals [Lindblad-Toh et al. 2011]
- Four binding sites of the NRSF transcription factor

24 Comparative gene finding
- Improve genome annotation using whole-genome alignments
- Look for specific signatures typical for genes (synonymous substitutions, indels preserving the reading frame)
Lin et al

25 Comparative gene finding
- Comparative genomics also helps to find special cases, e.g.
  - stop codon readthrough
  - selenoproteins (UGA to selenocysteine)
  - RNA editing (adenine to inosine)
Lin et al

26 Human Accelerated Regions [Pollard et al 2006]
- We are looking for genomic regions which:
  - were mutating slowly for a long time (negative selection)
  - change very fast in the human lineage (positive selection)
- Details:
  - Consider regions of length 100 with 96% sequence identity between chimpanzee and mouse/rat (35,000)
  - Compare with other mammals, select those that have many mutations in human and few elsewhere
  - Probabilistic model which allows scaling of the human branch
  - 49 statistically significant regions, 96% of them non-coding

27 Human Accelerated Regions: HAR1
- Region of length 118 bases
- 18 changes between human and chimpanzee (6 mil. years)
- 2 changes between chimpanzee and chicken (300 mil. years)

Human       CTGAAATGATGGGCGTAGACGCACGTCAGCGGCGGAAATGGTTTCTAT
Chimpanzee  CTGAAATTATAGGTGTAGACACATGTCAGCAGTGGAAATAGTTTCTAT
Gorilla     CTGAAATTATAGGTGTAGACACATGTCAGCAGTGGAAATAGTTTCTAT
Rhesus      CTGAAATTATAGGTGTAGACACATGTCAGCAGTGGAAATAGTTTCTAT
Mouse       CTGAAATTATAGGTGTAGACACATGTCAGCCGTGGAAATGGTTTCTAT
Cow         CTGAAATTATAGGTGTAGACACATGTCAGCAGTGGAAACCGTTTCTAT
Dog         CTGAAATTATAGGTGTAGACACATGTCAGCGGTGCAAACAGTTTCTAT
Chicken     CTGAAATTATAGGTGTAGACACATGTCAGCAGTAGAAACAGTTTCTAT

28 What is the function of HAR1?
- Overlaps RNA genes HAR1R and HAR1F
- HAR1F is expressed in the neocortex in 7 and 9 week old embryos, later also in other parts of the brain (in human and other primates)

29 What is the function of HAR1?
- Mutations change the RNA structure

30 Functional enrichment
- Results of whole-genome studies often come in the form of a list of significant genes
- In comparative genomics e.g. families with accelerated gain and loss, human accelerated regions, genes under positive selection etc.
- Also from other studies, e.g. differential expression analysis
- How to use such lists?
  - look manually at the most significant candidates
  - try to find common characteristics of the whole set

31 Gene ontology
- Hierarchical structure of biological terms describing functions of genes
  GO: biological process
    GO: localization
      GO: establishment of localization
        GO: transport
          GO: ion transport
            GO: ion transmembrane transport
- Databases contain gene ontology terms for many proteins
- Is some function enriched in our gene set?

32 Example [Kosiol et al 2007]
- n = number of genes overall
- n_i = 70 genes with the innate immune response term (0.4% of all genes)
- n_p = 400 genes with positive selection overall
- n_ip = 8 of them innate immune response (2% of genes with pos.sel.)

Contingency table
            Pos.sel.              No pos.sel.            Total
Immunity    8 (n_ip)              n_i - n_ip = 62        70 (n_i)
Other       n_p - n_ip = 392      n - n_i - n_p + n_ip   n - n_i
Total       400 (n_p)             n - n_p                n

33 Null hypothesis
- Genes in our list were randomly selected from all genes
- The whole genome has $n_i/n = 0.4\%$ immunity genes
- Our list should contain about $n_p (n_i/n)$ immunity genes
- We expect 1.7 genes, get 8 genes
- But purely by chance the number can be larger or smaller
- Urn with $n_i$ white and $n - n_i$ black balls
- Randomly select $n_p$ balls; how many are white? Call this number $X_{ip}$
- Hypergeometric distribution:

$$\Pr(X_{ip} = n_{ip}) = \frac{\binom{n_i}{n_{ip}} \binom{n - n_i}{n_p - n_{ip}}}{\binom{n}{n_p}}$$

- P-value: $P(X_{ip} \ge 8)$ =
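
The urn calculation can be sketched in a few lines using exact binomial coefficients. The total number of genes n is not preserved in the text above, so the value n = 17500 used below is an assumption chosen only so that 70/n equals 0.4%; the resulting number is therefore illustrative and is not the P-value reported on the slide. The function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, n, n_i, n_p):
    """Probability of drawing exactly k white balls.

    The urn holds n balls, n_i of them white, and n_p balls are drawn.
    """
    return comb(n_i, k) * comb(n - n_i, n_p - k) / comb(n, n_p)


def enrichment_p_value(n_ip, n, n_i, n_p):
    """P(X_ip >= n_ip): probability of at least n_ip white balls by chance."""
    upper = min(n_i, n_p)
    return sum(hypergeom_pmf(k, n, n_i, n_p) for k in range(n_ip, upper + 1))


n = 17500      # ASSUMED total gene count, consistent with n_i / n = 0.4%
n_i = 70       # genes with the innate immune response term
n_p = 400      # genes with positive selection
n_ip = 8       # genes with both properties

print("expected by chance:", n_p * n_i / n)
print("P-value:", enrichment_p_value(n_ip, n, n_i, n_p))
```

Python integers are exact, so the large binomial coefficients do not overflow; only the final ratio is converted to a floating-point number.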

34 Our research: ancestral gene orders in mitochondrial genomes [Valach et al. NAR, 2011, Kovac, Brejova, Vinar WABI 2011]
[Figure: phylogenetic tree of C. parapsilosis, C. orthopsilosis, C. jiufengensis, L. elongisporus, C. tropicalis, C. sojae, C. viswanathii, C. frijolesensis, C. neerlandica, C. albicans, C. maltosa, C. alai, C. subhashii, D. hansenii and P. sorbitophila, with numeric labels and ranges on the branches, shown next to the order of the mitochondrial genes (nad1, nad2, nad3, nad4, nad4l, nad5, nad6, cob, cox1, cox2, cox3, atp6, atp8, atp9, rnl, rns) in each extant species and inferred ancestor]

35 Our research: history inference for duplicated gene clusters [Vinar, Brejova, Song, Siepel. JCB 2010]
[Figure: human IFN cluster, chr 9]

36 Conclusion
- Comparative genomics can help us annotate genomes and study their evolution
- Evolution is typically characterized by stochastic models
- We can estimate parameters of these models from data to study typical patterns of evolution
- We can also detect atypical elements
- Gene family evolution, conserved elements, accelerated elements
- Next: positive selection in protein coding genes


More information

Stochastic processes and

Stochastic processes and Stochastic processes and Markov chains (part II) Wessel van Wieringen w.n.van.wieringen@vu.nl wieringen@vu nl Department of Epidemiology and Biostatistics, VUmc & Department of Mathematics, VU University

More information

BLAST. Varieties of BLAST

BLAST. Varieties of BLAST BLAST Basic Local Alignment Search Tool (1990) Altschul, Gish, Miller, Myers, & Lipman Uses short-cuts or heuristics to improve search speed Like speed-reading, does not examine every nucleotide of database

More information

Graph Alignment and Biological Networks

Graph Alignment and Biological Networks Graph Alignment and Biological Networks Johannes Berg http://www.uni-koeln.de/ berg Institute for Theoretical Physics University of Cologne Germany p.1/12 Networks in molecular biology New large-scale

More information

Molecular evolution 2. Please sit in row K or forward

Molecular evolution 2. Please sit in row K or forward Molecular evolution 2 Please sit in row K or forward RBFD: cat, mouse, parasite Toxoplamsa gondii cyst in a mouse brain http://phenomena.nationalgeographic.com/2013/04/26/mind-bending-parasite-permanently-quells-cat-fear-in-mice/

More information

Gene function annotation

Gene function annotation Gene function annotation Paul D. Thomas, Ph.D. University of Southern California What is function annotation? The formal answer to the question: what does this gene do? The association between: a description

More information

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS.

GENETICS - CLUTCH CH.22 EVOLUTIONARY GENETICS. !! www.clutchprep.com CONCEPT: OVERVIEW OF EVOLUTION Evolution is a process through which variation in individuals makes it more likely for them to survive and reproduce There are principles to the theory

More information

Lecture 17. Comparative genomics I: Genome annotation using evolutionary signatures

Lecture 17. Comparative genomics I: Genome annotation using evolutionary signatures 6.047/6.878/HST.507 Computational Biology: Genomes, Networks, Evolution Lecture 17 Comparative genomics I: Genome annotation using evolutionary signatures 1 Module V: Comparative genomics and evolution

More information

A Practical Algorithm for Ancestral Rearrangement Reconstruction

A Practical Algorithm for Ancestral Rearrangement Reconstruction A Practical Algorithm for Ancestral Rearrangement Reconstruction Jakub Kováč, Broňa Brejová, and Tomáš Vinař 2 Department of Computer Science, Faculty of Mathematics, Physics, and Informatics, Comenius

More information

How Molecules Evolve. Advantages of Molecular Data for Tree Building. Advantages of Molecular Data for Tree Building

How Molecules Evolve. Advantages of Molecular Data for Tree Building. Advantages of Molecular Data for Tree Building How Molecules Evolve Guest Lecture: Principles and Methods of Systematic Biology 11 November 2013 Chris Simon Approaching phylogenetics from the point of view of the data Understanding how sequences evolve

More information

Early History up to Schedule. Proteins DNA & RNA Schwann and Schleiden Cell Theory Charles Darwin publishes Origin of Species

Early History up to Schedule. Proteins DNA & RNA Schwann and Schleiden Cell Theory Charles Darwin publishes Origin of Species Schedule Bioinformatics and Computational Biology: History and Biological Background (JH) 0.0 he Parsimony criterion GKN.0 Stochastic Models of Sequence Evolution GKN 7.0 he Likelihood criterion GKN 0.0

More information

10-810: Advanced Algorithms and Models for Computational Biology. microrna and Whole Genome Comparison

10-810: Advanced Algorithms and Models for Computational Biology. microrna and Whole Genome Comparison 10-810: Advanced Algorithms and Models for Computational Biology microrna and Whole Genome Comparison Central Dogma: 90s Transcription factors DNA transcription mrna translation Proteins Central Dogma:

More information

Module: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment

Module: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment Module: Sequence Alignment Theory and Applications Session: Introduction to Searching and Sequence Alignment Introduction to Bioinformatics online course : IBT Jonathan Kayondo Learning Objectives Understand

More information

Biol478/ August

Biol478/ August Biol478/595 29 August # Day Inst. Topic Hwk Reading August 1 M 25 MG Introduction 2 W 27 MG Sequences and Evolution Handouts 3 F 29 MG Sequences and Evolution September M 1 Labor Day 4 W 3 MG Database

More information

Cladistics and Bioinformatics Questions 2013

Cladistics and Bioinformatics Questions 2013 AP Biology Name Cladistics and Bioinformatics Questions 2013 1. The following table shows the percentage similarity in sequences of nucleotides from a homologous gene derived from five different species

More information

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree)

9/30/11. Evolution theory. Phylogenetic Tree Reconstruction. Phylogenetic trees (binary trees) Phylogeny (phylogenetic tree) I9 Introduction to Bioinformatics, 0 Phylogenetic ree Reconstruction Yuzhen Ye (yye@indiana.edu) School of Informatics & omputing, IUB Evolution theory Speciation Evolution of new organisms is driven by

More information

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment

Algorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot

More information

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics)

UoN, CAS, DBSC BIOL102 lecture notes by: Dr. Mustafa A. Mansi. The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogeny? - Systematics? The Phylogenetic Systematics (Phylogeny and Systematics) - Phylogenetic systematics? Connection between phylogeny and classification. - Phylogenetic systematics informs the

More information

Substitution = Mutation followed. by Fixation. Common Ancestor ACGATC 1:A G 2:C A GAGATC 3:G A 6:C T 5:T C 4:A C GAAATT 1:G A

Substitution = Mutation followed. by Fixation. Common Ancestor ACGATC 1:A G 2:C A GAGATC 3:G A 6:C T 5:T C 4:A C GAAATT 1:G A GAGATC 3:G A 6:C T Common Ancestor ACGATC 1:A G 2:C A Substitution = Mutation followed 5:T C by Fixation GAAATT 4:A C 1:G A AAAATT GAAATT GAGCTC ACGACC Chimp Human Gorilla Gibbon AAAATT GAAATT GAGCTC ACGACC

More information

Inferring Molecular Phylogeny

Inferring Molecular Phylogeny Dr. Walter Salzburger he tree of life, ustav Klimt (1907) Inferring Molecular Phylogeny Inferring Molecular Phylogeny 55 Maximum Parsimony (MP): objections long branches I!! B D long branch attraction

More information

Lecture 4. Models of DNA and protein change. Likelihood methods

Lecture 4. Models of DNA and protein change. Likelihood methods Lecture 4. Models of DNA and protein change. Likelihood methods Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 4. Models of DNA and protein change. Likelihood methods p.1/36

More information

Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis

Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis Elements of Bioinformatics 14F01 TP5 -Phylogenetic analysis 10 December 2012 - Corrections - Exercise 1 Non-vertebrate chordates generally possess 2 homologs, vertebrates 3 or more gene copies; a Drosophila

More information

Tree of Life iological Sequence nalysis Chapter http://tolweb.org/tree/ Phylogenetic Prediction ll organisms on Earth have a common ancestor. ll species are related. The relationship is called a phylogeny

More information

Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/22

Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) p.1/22 Lecture 24. Phylogeny methods, part 4 (Models of DNA and protein change) Joe Felsenstein Department of Genome Sciences and Department of Biology Lecture 24. Phylogeny methods, part 4 (Models of DNA and

More information

Orthology Part I: concepts and implications Toni Gabaldón Centre for Genomic Regulation (CRG), Barcelona

Orthology Part I: concepts and implications Toni Gabaldón Centre for Genomic Regulation (CRG), Barcelona Orthology Part I: concepts and implications Toni Gabaldón Centre for Genomic Regulation (CRG), Barcelona (tgabaldon@crg.es) http://gabaldonlab.crg.es Homology the same organ in different animals under

More information

Reading for Lecture 13 Release v10

Reading for Lecture 13 Release v10 Reading for Lecture 13 Release v10 Christopher Lee November 15, 2011 Contents 1 Evolutionary Trees i 1.1 Evolution as a Markov Process...................................... ii 1.2 Rooted vs. Unrooted Trees........................................

More information

The African coelacanth genome provides insights into tetrapod evolution

The African coelacanth genome provides insights into tetrapod evolution The African coelacanth genome provides insights into tetrapod evolution bioinformaatika ajakirjaklubi 27.05.2013 Ülesehitus Täisgenoomi sekveneerimisest vankrid mille ette neid andmeid on rakendatud evolutsiooni

More information

Genomes and Their Evolution

Genomes and Their Evolution Chapter 21 Genomes and Their Evolution PowerPoint Lecture Presentations for Biology Eighth Edition Neil Campbell and Jane Reece Lectures by Chris Romero, updated by Erin Barley with contributions from

More information

The Causes and Consequences of Variation in. Evolutionary Processes Acting on DNA Sequences

The Causes and Consequences of Variation in. Evolutionary Processes Acting on DNA Sequences The Causes and Consequences of Variation in Evolutionary Processes Acting on DNA Sequences This dissertation is submitted for the degree of Doctor of Philosophy at the University of Cambridge Lee Nathan

More information

Research Proposal. Title: Multiple Sequence Alignment used to investigate the co-evolving positions in OxyR Protein family.

Research Proposal. Title: Multiple Sequence Alignment used to investigate the co-evolving positions in OxyR Protein family. Research Proposal Title: Multiple Sequence Alignment used to investigate the co-evolving positions in OxyR Protein family. Name: Minjal Pancholi Howard University Washington, DC. June 19, 2009 Research

More information

Molecular Evolution and Phylogenetic Tree Reconstruction

Molecular Evolution and Phylogenetic Tree Reconstruction 1 4 Molecular Evolution and Phylogenetic Tree Reconstruction 3 2 5 1 4 2 3 5 Orthology, Paralogy, Inparalogs, Outparalogs Phylogenetic Trees Nodes: species Edges: time of independent evolution Edge length

More information

Lecture 3: Markov chains.

Lecture 3: Markov chains. 1 BIOINFORMATIK II PROBABILITY & STATISTICS Summer semester 2008 The University of Zürich and ETH Zürich Lecture 3: Markov chains. Prof. Andrew Barbour Dr. Nicolas Pétrélis Adapted from a course by Dr.

More information

C3020 Molecular Evolution. Exercises #3: Phylogenetics

C3020 Molecular Evolution. Exercises #3: Phylogenetics C3020 Molecular Evolution Exercises #3: Phylogenetics Consider the following sequences for five taxa 1-5 and the known outgroup O, which has the ancestral states (note that sequence 3 has changed from

More information

Bayesian Models for Phylogenetic Trees

Bayesian Models for Phylogenetic Trees Bayesian Models for Phylogenetic Trees Clarence Leung* 1 1 McGill Centre for Bioinformatics, McGill University, Montreal, Quebec, Canada ABSTRACT Introduction: Inferring genetic ancestry of different species

More information

Phylogenetic trees 07/10/13

Phylogenetic trees 07/10/13 Phylogenetic trees 07/10/13 A tree is the only figure to occur in On the Origin of Species by Charles Darwin. It is a graphical representation of the evolutionary relationships among entities that share

More information

Computational Biology: Basics & Interesting Problems

Computational Biology: Basics & Interesting Problems Computational Biology: Basics & Interesting Problems Summary Sources of information Biological concepts: structure & terminology Sequencing Gene finding Protein structure prediction Sources of information

More information

Phylogeny and Evolution. Gina Cannarozzi ETH Zurich Institute of Computational Science

Phylogeny and Evolution. Gina Cannarozzi ETH Zurich Institute of Computational Science Phylogeny and Evolution Gina Cannarozzi ETH Zurich Institute of Computational Science History Aristotle (384-322 BC) classified animals. He found that dolphins do not belong to the fish but to the mammals.

More information

What is Phylogenetics

What is Phylogenetics What is Phylogenetics Phylogenetics is the area of research concerned with finding the genetic connections and relationships between species. The basic idea is to compare specific characters (features)

More information

Computational approaches for functional genomics

Computational approaches for functional genomics Computational approaches for functional genomics Kalin Vetsigian October 31, 2001 The rapidly increasing number of completely sequenced genomes have stimulated the development of new methods for finding

More information

Exploring Evolution & Bioinformatics

Exploring Evolution & Bioinformatics Chapter 6 Exploring Evolution & Bioinformatics Jane Goodall The human sequence (red) differs from the chimpanzee sequence (blue) in only one amino acid in a protein chain of 153 residues for myoglobin

More information

Page 1. Evolutionary Trees. Why build evolutionary tree? Outline

Page 1. Evolutionary Trees. Why build evolutionary tree? Outline Page Evolutionary Trees Russ. ltman MI S 7 Outline. Why build evolutionary trees?. istance-based vs. character-based methods. istance-based: Ultrametric Trees dditive Trees. haracter-based: Perfect phylogeny

More information

Browsing Genomic Information with Ensembl Plants

Browsing Genomic Information with Ensembl Plants Browsing Genomic Information with Ensembl Plants Etienne de Villiers, PhD (Adapted from slides by Bert Overduin EMBL-EBI) Outline of workshop Brief introduction to Ensembl Plants History Content Tutorial

More information

What Is Conservation?

What Is Conservation? What Is Conservation? Lee A. Newberg February 22, 2005 A Central Dogma Junk DNA mutates at a background rate, but functional DNA exhibits conservation. Today s Question What is this conservation? Lee A.

More information

Comparative Network Analysis

Comparative Network Analysis Comparative Network Analysis BMI/CS 776 www.biostat.wisc.edu/bmi776/ Spring 2016 Anthony Gitter gitter@biostat.wisc.edu These slides, excluding third-party material, are licensed under CC BY-NC 4.0 by

More information

Phylogenetics: Building Phylogenetic Trees

Phylogenetics: Building Phylogenetic Trees 1 Phylogenetics: Building Phylogenetic Trees COMP 571 Luay Nakhleh, Rice University 2 Four Questions Need to be Answered What data should we use? Which method should we use? Which evolutionary model should

More information

What can sequences tell us?

What can sequences tell us? Bioinformatics What can sequences tell us? AGACCTGAGATAACCGATAC By themselves? Not a heck of a lot...* *Indeed, one of the key results learned from the Human Genome Project is that disease is much more

More information