DNA Phylogeny
Signals and Systems in Biology
Kushal Shah @ EE, IIT Delhi
Phylogenetics
- Grouping and Division of organisms
- Keeps changing with time
- Splitting, hybridization and termination
- Cladistics : Methods to determine phylogeny
- Tree showing evolutionary relationships between various species
- Taxa joined if they are believed to have a common ancestor
Phylogenetic Tree : Rooted and Unrooted
Methods of Phylogenetic Analysis
- Morphological Analysis
  - Average body size
  - Lengths or sizes of specific physical features
  - Certain kinds of behaviour
- Comparison of RNA sequences : 18S ribosomal RNA
- Computational Analysis
  - Compute distance matrix
    - Alignment based and Alignment-Free Methods
  - Generate phylogenetic tree based on this matrix
    - Neighbor joining
    - Fitch-Margoliash method
  - Using independent information
    - Maximum Parsimony
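As an illustration of the "compute distance matrix" step, here is a minimal alignment-free sketch. It assumes each sequence is summarised by its overlapping k-mer frequencies and that two profiles are compared with the Euclidean distance. The function names `kmer_profile` and `distance_matrix`, the choice k = 2 and the toy sequences are all hypothetical and are not taken from the slides.

```python
import math
from collections import Counter
from itertools import product

def kmer_profile(seq, k=2):
    """Relative frequency of every DNA k-mer in seq (overlapping windows)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    # Fixed k-mer order so that profiles of different sequences are comparable.
    return [counts["".join(kmer)] / total for kmer in product("ACGT", repeat=k)]

def distance_matrix(seqs, k=2):
    """Pairwise Euclidean distances between k-mer profiles (an assumed metric)."""
    profiles = [kmer_profile(s, k) for s in seqs]
    n = len(seqs)
    return [[math.dist(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]

# Toy sequences, made up for this example.
seqs = ["ACGTACGT", "ACGTACGA", "TTTTTTTT"]
for row in distance_matrix(seqs):
    print([round(v, 3) for v in row])
```

The resulting matrix is symmetric with a zero diagonal, which is the form expected by distance-based tree builders such as neighbor joining.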
Neighbor Joining (NJ)
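The slide's figure is not reproduced here. As a rough guide, the standard neighbor joining procedure (Saitou and Nei) can be sketched as follows: repeatedly pick the pair of nodes minimising Q(i, j) = (n - 2) d(i, j) - r_i - r_j, where r_i is the i-th row sum, join them at a new internal node, and update the distance matrix. The function name `neighbor_joining`, the `nodeN` labels and the 5-taxon matrix are illustrative assumptions, not material from the slides.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Return the edges (node_a, node_b, length) of the NJ tree.

    names: list of taxon labels; dist: symmetric matrix as a list of lists.
    """
    # Dict-of-dicts so that nodes can be added and removed by label.
    d = {a: {b: dist[i][j] for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    edges = []
    next_id = 0
    while len(d) > 2:
        n = len(d)
        r = {a: sum(d[a].values()) for a in d}  # row sums
        # Pair minimising Q(i, j) = (n - 2) d(i, j) - r_i - r_j.
        f, g = min(combinations(d, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        u = f"node{next_id}"  # label of the new internal node
        next_id += 1
        len_f = 0.5 * d[f][g] + (r[f] - r[g]) / (2 * (n - 2))
        len_g = d[f][g] - len_f
        edges += [(f, u, len_f), (g, u, len_g)]
        # Distances from the new node to every remaining node.
        new_row = {k: 0.5 * (d[f][k] + d[g][k] - d[f][g])
                   for k in d if k not in (f, g)}
        for k in (f, g):
            del d[k]
        for row in d.values():
            del row[f], row[g]
        for k, v in new_row.items():
            d[k][u] = v
        d[u] = {**new_row, u: 0.0}
    a, b = d  # the two remaining nodes are joined directly
    edges.append((a, b, d[a][b]))
    return edges

# Illustrative additive distance matrix for five taxa.
D = [[0, 5, 9, 9, 8],
     [5, 0, 10, 10, 9],
     [9, 10, 0, 8, 7],
     [9, 10, 8, 0, 3],
     [8, 9, 7, 3, 0]]
for edge in neighbor_joining(list("abcde"), D):
    print(edge)
```

The output is an unrooted tree given as an edge list; ties in Q are broken by taking the first minimising pair, which can change the internal node labels but not the tree itself when the distances are additive.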
Maximum Parsimony
- Number of evolutionary events necessary to explain the observed differences
- Every evolutionary event has an associated cost
- Locate a tree with the minimum cost
- Not every event is equally likely
- NP-hard problem!!
- Heuristic search methods for the cheapest tree
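Finding the cheapest tree is the hard part, but scoring one given tree is easy. The sketch below uses Fitch's algorithm, which counts the minimum number of state changes on a fixed rooted binary tree under the simplifying assumption that every change costs 1 (the slide notes that real events are not equally likely). The names `fitch_score` and `parsimony_score`, the nested-tuple tree encoding and the toy alignment are hypothetical.

```python
def fitch_score(tree, states):
    """Minimum number of state changes for one character on a fixed tree.

    tree: nested 2-tuples with leaf names at the tips,
          e.g. (("A", "B"), ("C", "D")).
    states: dict mapping each leaf name to its observed state.
    """
    def visit(node):
        if not isinstance(node, tuple):       # leaf: its state set, no cost
            return {states[node]}, 0
        (left_set, left_cost), (right_set, right_cost) = visit(node[0]), visit(node[1])
        common = left_set & right_set
        if common:                            # children agree: no new change
            return common, left_cost + right_cost
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]

def parsimony_score(tree, alignment):
    """Sum of Fitch scores over all columns of an alignment (leaf -> sequence)."""
    length = len(next(iter(alignment.values())))
    return sum(fitch_score(tree, {name: seq[i] for name, seq in alignment.items()})
               for i in range(length))

# Toy aligned sequences, made up for this example.
alignment = {"A": "AAG", "B": "AAG", "C": "ACT", "D": "ACT"}
print(parsimony_score((("A", "B"), ("C", "D")), alignment))  # 2
print(parsimony_score((("A", "C"), ("B", "D")), alignment))  # 4
```

A heuristic search would call a scoring function like this on many candidate topologies and keep the cheapest one found.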
Genetic Distance
- $X, Y$ : Two populations for which $L$ loci have been sampled
- $X_u, Y_u$ : Frequency of the $u$th allele at the $l$th location
- Nei's method :
\[ D_a = -\ln \frac{\sum_l \sum_u X_u Y_u}{\sqrt{\left(\sum_l \sum_u X_u^2\right)\left(\sum_l \sum_u Y_u^2\right)}} \]
- Cavalli-Sforza chord measure :
\[ D_{CH} = \frac{2}{\pi L} \sum_l \sqrt{2\left(1 - \sum_u \sqrt{X_u Y_u}\right)} \]
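The two measures can be evaluated directly from allele-frequency tables. The sketch below assumes each population is stored as a list over loci, with one list of allele frequencies per locus (same allele order in both populations). The names `nei_distance` and `chord_distance` and the example frequencies are hypothetical.

```python
import math

def nei_distance(X, Y):
    """Nei's distance: -ln of the normalised sum of X_u * Y_u over loci and alleles."""
    j_xy = sum(x * y for lx, ly in zip(X, Y) for x, y in zip(lx, ly))
    j_x = sum(x * x for lx in X for x in lx)
    j_y = sum(y * y for ly in Y for y in ly)
    # Undefined (log of zero) when the populations share no allele at any locus.
    return -math.log(j_xy / math.sqrt(j_x * j_y))

def chord_distance(X, Y):
    """Cavalli-Sforza chord measure, averaged over the L sampled loci."""
    L = len(X)
    total = 0.0
    for lx, ly in zip(X, Y):
        s = sum(math.sqrt(x * y) for x, y in zip(lx, ly))
        # max() guards against tiny negative values caused by rounding.
        total += math.sqrt(2.0 * max(0.0, 1.0 - s))
    return 2.0 / (math.pi * L) * total

# Made-up frequencies: two loci, two alleles each.
X = [[0.5, 0.5], [0.9, 0.1]]
Y = [[0.7, 0.3], [0.6, 0.4]]
print(nei_distance(X, Y), chord_distance(X, Y))
```

Both functions return 0 for identical populations; the chord measure stays finite even when two populations share no alleles, whereas Nei's distance diverges in that case.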
Genetic Distance : Information
\[ d(x,y) = 1 - \frac{K(x) - K(x|y)}{K(xy)} \]
- $K(\cdot)$ : Kolmogorov Complexity
- $d(x,y)$ satisfies triangle inequality
- $d(x,y) = d(y,x)$ : non-trivial proof
- M. Li et al., Bioinformatics 2001
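Kolmogorov complexity is not computable, so any practical use of this distance replaces $K$ by the output length of a real compressor. The sketch below is one such approximation, under two assumptions that are not from the slide: $K(\cdot)$ is replaced by the zlib-compressed length, and $K(x|y)$ is estimated as the extra bytes needed to compress $x$ after $y$. The names `compressed_size` and `info_distance` are hypothetical.

```python
import random
import zlib

def compressed_size(data: bytes) -> int:
    """Stand-in for K(.): length in bytes of the zlib-compressed input."""
    return len(zlib.compress(data, 9))

def info_distance(x: bytes, y: bytes) -> float:
    """Approximate d(x, y) = 1 - (K(x) - K(x|y)) / K(xy) with a compressor."""
    k_x = compressed_size(x)
    # Assumed estimate: K(x|y) is roughly C(yx) - C(y).
    k_x_given_y = compressed_size(y + x) - compressed_size(y)
    k_xy = compressed_size(x + y)
    return 1.0 - (k_x - k_x_given_y) / k_xy

# Synthetic sequences with a fixed seed, for illustration only.
rng = random.Random(0)
x = "".join(rng.choice("ACGT") for _ in range(2000)).encode()
z = "".join(rng.choice("ACGT") for _ in range(2000)).encode()
print(info_distance(x, x), info_distance(x, z))
```

A general-purpose compressor only approximates the ideal quantity, so the computed values need not be exactly symmetric or confined to [0, 1]; a compressor designed for DNA would be a closer stand-in.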
Genetic Distance : Correlation Based
\[ I(k) = \sum_{i,j \in S} p^{(k)}(i,j) \log_4 \frac{p^{(k)}(i,j)}{p(i)\,p(j)} \]
- $S = \{A, T, G, C\}$
- M. Dehnert et al., J. Computational Biology 2005
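A minimal sketch of evaluating $I(k)$ on a single sequence follows. It assumes that $p^{(k)}(i,j)$ is the observed frequency of symbol $i$ followed by symbol $j$ exactly $k$ positions later, and that $p(i)$ is the overall frequency of $i$ in the sequence; the slide does not spell out these definitions. The function name `mutual_information` and the periodic test sequences are hypothetical.

```python
import math
from collections import Counter

def mutual_information(seq, k):
    """I(k) in base-4 units for symbols separated by k positions."""
    n_pairs = len(seq) - k
    pair_counts = Counter((seq[t], seq[t + k]) for t in range(n_pairs))
    single_counts = Counter(seq)
    total = len(seq)
    info = 0.0
    # Pairs that never occur contribute 0 and are simply absent from the counter.
    for (i, j), count in pair_counts.items():
        p_ij = count / n_pairs
        p_i = single_counts[i] / total
        p_j = single_counts[j] / total
        info += p_ij * math.log(p_ij / (p_i * p_j), 4)
    return info

# Perfectly periodic toy sequences: the symbol k steps ahead is fully determined.
print(mutual_information("AT" * 50, 2))
print(mutual_information("ACGT" * 25, 4))
```

Computing $I(k)$ for a range of $k$ gives a profile per sequence, and profiles of different genomes can then be compared to obtain a distance.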