DNA Phylogeny. Signals and Systems in Biology Kushal EE, IIT Delhi


DNA Phylogeny Signals and Systems in Biology Kushal Shah @ EE, IIT Delhi

Phylogenetics
- Grouping and division of organisms
- Keeps changing with time
- Splitting, hybridization and termination
- Cladistics: methods to determine phylogeny


- Tree showing evolutionary relationships between various species
- Taxa are joined if they are believed to have a common ancestor

Phylogenetic Tree: Rooted and Unrooted

Methods of Phylogenetic Analysis
- Morphological Analysis
  - Average body size
  - Lengths or sizes of specific physical features
  - Certain kinds of behaviour
- Comparison of RNA sequences: 18S ribosomal RNA
- Computational Analysis
  - Compute distance matrix
    - Alignment-based and alignment-free methods
  - Generate phylogenetic tree based on this matrix
    - Neighbor joining
    - Fitch-Margoliash method
  - Using independent information
    - Maximum Parsimony
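The "compute distance matrix" step can be sketched in Python. The slides do not say which distance to use, so both measures here are illustrative assumptions: `p_distance` (fraction of mismatching columns, for sequences that are already aligned) and `kmer_distance` (Euclidean distance between k-mer frequency profiles, an alignment-free alternative). All function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def p_distance(a, b):
    """Alignment-based: fraction of aligned positions that differ."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def kmer_distance(a, b, k=3):
    """Alignment-free: Euclidean distance between k-mer frequency profiles."""
    ca = Counter(a[i:i + k] for i in range(len(a) - k + 1))
    cb = Counter(b[i:i + k] for i in range(len(b) - k + 1))
    na, nb = sum(ca.values()), sum(cb.values())
    words = set(ca) | set(cb)
    # Counter returns 0 for k-mers that are absent from a sequence.
    return sqrt(sum((ca[w] / na - cb[w] / nb) ** 2 for w in words))


def distance_matrix(seqs, metric):
    """Build a nested dict: matrix[a][b] = metric(seqs[a], seqs[b])."""
    names = list(seqs)
    return {a: {b: metric(seqs[a], seqs[b]) for b in names} for a in names}


# Toy, made-up sequences used only to exercise the functions.
toy = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TTGTACCA"}
aligned = distance_matrix(toy, p_distance)
alignment_free = distance_matrix(toy, kmer_distance)
```

Either matrix can then be passed to a tree-building method such as neighbor joining.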


Neighbor Joining (NJ)
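A minimal Python sketch of neighbor joining, assuming the standard Saitou-Nei formulation (the slide itself gives only the name). At each step it joins the pair minimising Q(a, b) = (n - 2) d(a, b) - r(a) - r(b), where r(a) is the sum of distances from a to all other active nodes. The function name, the `U1, U2, ...` labels for internal nodes, and the five-taxon matrix are illustrative assumptions.

```python
from itertools import combinations


def neighbor_joining(dist):
    """Run neighbor joining on a symmetric distance table.

    dist: dict of dicts, dist[a][b] = distance between taxa a and b.
    Returns (joins, last_edge): joins is a list of
    (a, b, branch_len_a, branch_len_b, new_node) tuples and last_edge is
    (x, y, length) for the edge connecting the final two nodes.
    """
    d = {a: dict(row) for a, row in dist.items()}  # working copy
    nodes = list(d)
    joins = []
    while len(nodes) > 2:
        n = len(nodes)
        # r[a]: total distance from a to every other active node
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Choose the pair with the smallest Q value
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]],
        )
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        u = f"U{len(joins) + 1}"  # label for the new internal node
        d[u] = {}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
        joins.append((a, b, len_a, len_b, u))
    x, y = nodes
    return joins, (x, y, d[x][y])


# Made-up additive distance matrix on five taxa.
names = ["a", "b", "c", "d", "e"]
matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
dist = {p: {q: matrix[i][j] for j, q in enumerate(names)}
        for i, p in enumerate(names)}
joins, last_edge = neighbor_joining(dist)
for a, b, la, lb, u in joins:
    print(f"join {a} ({la}) and {b} ({lb}) -> {u}")
print("final edge:", last_edge)
```

Ties in Q are broken by iteration order here; real implementations may choose differently, which can change the join order but not the tree for an additive matrix.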

Maximum Parsimony
- Number of evolutionary events necessary to explain the observed differences
- Every evolutionary event has an associated cost
- Locate a tree with the minimum cost
- Not every event is equally likely
- NP-hard problem
- Heuristic search methods for the cheapest tree
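Scoring one candidate tree is the easy part of maximum parsimony; the NP-hard part is the search over trees. The sketch below scores a fixed rooted binary tree with Fitch's algorithm, assuming every substitution costs 1 (a simplification: the slide notes that events need not be equally likely, which would call for weighted costs). The tree encoding (nested tuples of taxon names) and the toy alignment are illustrative assumptions.

```python
def fitch(tree, states):
    """Return (candidate_state_set, substitution_count) for one site.

    tree: a taxon name (leaf) or a pair (left_subtree, right_subtree).
    states: dict mapping taxon name -> observed character at this site.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra event needed
        return common, left_cost + right_cost
    # Children disagree: take the union and count one substitution
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree, alignment):
    """Sum unit-cost Fitch scores over all alignment columns."""
    length = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(length)
    )


# Toy aligned sequences (made up for illustration).
alignment = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CGT"}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
score_1 = parsimony_score(tree_1, alignment)
score_2 = parsimony_score(tree_2, alignment)
print(score_1, score_2)  # the tree with the lower score is preferred
```

A heuristic search would call `parsimony_score` on many candidate topologies and keep the cheapest one found.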


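Genetic Distance
- X, Y: two populations for which L loci have been sampled
- X_u, Y_u: frequency of the u-th allele at the l-th location
- Nei's method:
  $$ D_a = -\ln \frac{\sum_l \sum_u X_u Y_u}{\sqrt{\left(\sum_l \sum_u X_u^2\right)\left(\sum_l \sum_u Y_u^2\right)}} $$
- Cavalli-Sforza chord measure:
  $$ D_{CH} = \frac{2}{\pi L} \sum_l \sqrt{2\left(1 - \sum_u \sqrt{X_u Y_u}\right)} $$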
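Both measures can be computed directly from allele-frequency tables. This sketch assumes the standard forms: Nei's distance as the negative log of the normalised sum of frequency products over all loci and alleles, and the Cavalli-Sforza chord measure as (2 / (pi L)) times the per-locus sum of sqrt(2 (1 - sum_u sqrt(X_u Y_u))). The data layout (one list of allele frequencies per locus) and the function names are illustrative assumptions.

```python
import math


def nei_distance(X, Y):
    """Nei's distance; X and Y are lists of per-locus frequency lists."""
    j_xy = sum(x * y for lx, ly in zip(X, Y) for x, y in zip(lx, ly))
    j_x = sum(x * x for lx in X for x in lx)
    j_y = sum(y * y for ly in Y for y in ly)
    # Undefined (log of zero) if the populations share no alleles at all
    return -math.log(j_xy / math.sqrt(j_x * j_y))


def chord_distance(X, Y):
    """Cavalli-Sforza chord measure averaged over the L sampled loci."""
    L = len(X)
    total = 0.0
    for lx, ly in zip(X, Y):
        overlap = sum(math.sqrt(x * y) for x, y in zip(lx, ly))
        # Clamp at 0 to guard against tiny negative rounding errors
        total += math.sqrt(2.0 * max(0.0, 1.0 - overlap))
    return 2.0 / (math.pi * L) * total


# Made-up frequencies: two loci, two alleles each.
pop_x = [[0.5, 0.5], [0.25, 0.75]]
pop_y = [[0.9, 0.1], [0.25, 0.75]]
print(nei_distance(pop_x, pop_y), chord_distance(pop_x, pop_y))
```

Identical populations give a distance of zero under both measures, and the chord measure stays finite even when two populations share no alleles.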


Genetic Distance: Information
  $$ d(x,y) = 1 - \frac{K(x) - K(x \mid y)}{K(xy)} $$
- K(·): Kolmogorov complexity
- d(x,y) satisfies the triangle inequality
- d(x,y) ≈ d(y,x): non-trivial proof
M. Li et al., Bioinformatics 2001
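Kolmogorov complexity is not computable, so any practical use of this distance has to approximate K with a real compressor. The sketch below substitutes zlib's compressed length for K, which is an assumption made here for illustration and not the method of the cited paper; K(x|y) is approximated as C(yx) - C(y). The function names and the random test sequences are hypothetical.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Stand-in for K(.): length of the zlib-compressed input."""
    return len(zlib.compress(data, 9))


def info_distance(x: str, y: str) -> float:
    """Approximate d(x, y) = 1 - (K(x) - K(x|y)) / K(xy) using zlib."""
    bx, by = x.encode(), y.encode()
    k_x = compressed_size(bx)
    # K(x|y) is approximated by the extra cost of x once y is known
    k_x_given_y = compressed_size(by + bx) - compressed_size(by)
    k_xy = compressed_size(bx + by)
    return 1.0 - (k_x - k_x_given_y) / k_xy


rng = random.Random(0)  # fixed seed so the example is repeatable
x = "".join(rng.choice("ACGT") for _ in range(2000))
y = "".join(rng.choice("ACGT") for _ in range(2000))  # unrelated to x
z = list(x)
for i in range(0, len(z), 100):  # z: copy of x with a few point mutations
    z[i] = "C" if z[i] == "A" else "A"
z = "".join(z)

print("related  :", info_distance(x, z))
print("unrelated:", info_distance(x, y))
```

A general-purpose compressor is a crude proxy on DNA, so the values are only meaningful relative to each other: related sequences should score lower than unrelated ones.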


Genetic Distance: Correlation Based
  $$ I(k) = \sum_{i,j \in S} p^{(k)}(i,j) \, \log_4 \frac{p^{(k)}(i,j)}{p(i)\,p(j)} $$
- S = {A, T, G, C}
M. Dehnert et al., J. Computational Biology 2005
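The mutual information function I(k) can be estimated from a single sequence by counting symbol pairs that lie k positions apart. In this sketch, p^(k)(i,j) is the relative frequency of the pair (i, j) at separation k and p(i) is the overall frequency of symbol i; estimating the marginals from the whole sequence is an assumption, as are the function name and the toy sequence.

```python
from collections import Counter
from math import log

ALPHABET = "ATGC"  # S = {A, T, G, C}


def mutual_information(seq, k):
    """Estimate I(k), in base-4 units, for symbols k positions apart."""
    n = len(seq)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < len(seq)")
    single = Counter(seq)              # counts used for p(i)
    pairs = Counter(zip(seq, seq[k:])) # counts used for p^(k)(i, j)
    n_pairs = n - k
    total = 0.0
    for i in ALPHABET:
        for j in ALPHABET:
            p_ij = pairs[(i, j)] / n_pairs
            if p_ij > 0:  # terms with zero probability contribute nothing
                p_i, p_j = single[i] / n, single[j] / n
                total += p_ij * log(p_ij / (p_i * p_j), 4)
    return total


# Toy periodic sequence: symbols 4 apart are always identical.
toy_seq = "ACGT" * 50
profile = [mutual_information(toy_seq, k) for k in range(1, 7)]
print(profile)
```

The resulting profile over k can serve as a per-sequence signature, and two sequences can then be compared through the difference between their profiles.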