Computational Biology, Phylogenetic Trees. Consensus methods


Asger Bruun & Bo Simonsen. 16 January 2008. Department of Computer Science, University of Copenhagen.

0 Motivation

Given a collection of trees Τ = {T_0, ..., T_n}, we want to find a common tree that combines all trees in Τ. Different algorithms for constructing a phylogenetic tree give different results, and we want to combine those into one single tree. We may also have different biological data for each species (taxon), which we want to represent in one phylogenetic tree using consensus methods.

Methods

We'll consider two settings. In the first, all trees (input and output) have the same set of taxa. In the second, each input tree has a subset of the taxa, but the output tree contains the whole set of taxa.

1 Methods based on splits and clusters

A phylogenetic tree can be subdivided into clusters (also called monophyletic groups or clades) or splits. In a phylogenetic tree (rooted or unrooted) every leaf represents a taxon.

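1.1 Strict consensus

A simple method for tree consensus is to select the clusters or splits common to every input tree.

[Figure: two examples, each showing a pair of input trees and their strict consensus tree.]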
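The selection rule for strict consensus can be sketched in a few lines. This is a minimal illustration, not the presenters' implementation: it assumes rooted trees written as nested tuples with string leaf labels, and the helper names `clusters` and `strict_consensus_clusters` are hypothetical.

```python
def clusters(tree):
    """Return all clusters (frozensets of leaf labels) of a rooted tree.

    Assumption of this sketch: a tree is either a leaf label (str)
    or a tuple of subtrees.
    """
    found = set()

    def leaves(node):
        if isinstance(node, str):
            leaf_set = frozenset([node])
        else:
            leaf_set = frozenset().union(*(leaves(child) for child in node))
        found.add(leaf_set)
        return leaf_set

    leaves(tree)
    return found


def strict_consensus_clusters(trees):
    """Keep only the clusters that occur in every input tree."""
    return set.intersection(*(clusters(t) for t in trees))


t1 = ((("a", "b"), "c"), ("d", "e"))
t2 = ((("a", "b"), "d"), ("c", "e"))
common = strict_consensus_clusters([t1, t2])
# Drop the trivial clusters (single leaves and the full taxon set).
nontrivial = {c for c in common if 1 < len(c) < 5}
print(sorted(sorted(c) for c in nontrivial))  # [['a', 'b']]
```

The two example trees agree only on the cluster {a, b}, so the strict consensus tree is a star with a single cherry (a, b).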

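1.2 More methods based on splits and clusters

Majority rule: like strict consensus, but select only the clusters / splits which are present in more than 50% of the trees in T.

Loose consensus (aka semi-strict): select the clusters / splits which are compatible with every tree in T.

[Figure: input trees T_1 and T_2 and their loose consensus tree.]

Definition: A collection of groups C is compatible if there exists a tree T' s.t. every group in C is a cluster / split of T'.

Greedy consensus: like the loose consensus tree, but the input splits are sorted by frequency. Take the element with the highest frequency first, then go through the list in order and build a collection of compatible clusters / splits, adding each element that is compatible with those already chosen. This gives the greedy consensus tree.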
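The majority-rule and greedy selections can be sketched as follows. This is an illustration under stated assumptions: rooted trees as nested tuples, clusters rather than splits, and frequency ties broken alphabetically (the slides do not say how ties are broken). For rooted clusters, a collection is compatible exactly when every pair is nested or disjoint, which is what the hypothetical helper `compatible` checks.

```python
from collections import Counter


def clusters(tree):
    """All clusters (frozensets of leaf labels) of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            leaf_set = frozenset([node])
        else:
            leaf_set = frozenset().union(*(leaves(child) for child in node))
        found.add(leaf_set)
        return leaf_set

    leaves(tree)
    return found


def compatible(a, b):
    """Two rooted clusters fit in one tree iff they are nested or disjoint."""
    return a.isdisjoint(b) or a <= b or b <= a


def majority_clusters(trees):
    """Clusters present in more than half of the input trees."""
    counts = Counter(c for t in trees for c in clusters(t))
    return {c for c, k in counts.items() if 2 * k > len(trees)}


def greedy_clusters(trees):
    """Scan clusters by decreasing frequency, keeping each compatible one."""
    counts = Counter(c for t in trees for c in clusters(t))
    # Assumption: ties in frequency are broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], sorted(item[0])))
    chosen = []
    for cluster, _ in ranked:
        if all(compatible(cluster, other) for other in chosen):
            chosen.append(cluster)
    return set(chosen)


trees = [
    ((("a", "b"), "c"), ("d", "e")),
    ((("a", "b"), "d"), ("c", "e")),
    (("a", "b"), ("c", ("d", "e"))),
]
for name, result in (("majority", majority_clusters(trees)),
                     ("greedy", greedy_clusters(trees))):
    print(name, sorted("".join(sorted(c)) for c in result if 1 < len(c) < 5))
```

In this example the greedy tree refines the majority-rule tree: it keeps {a, b} and {d, e} and additionally accepts {a, b, c}, the first low-frequency cluster that does not conflict with them.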

2 Methods based on intersection

Adams consensus

The earliest of all consensus methods. It only works on rooted trees; there is no analogue for unrooted trees. We construct the tree recursively, using this algorithm:

Procedure AdamsTree(T_1, ..., T_k)
    if T_1 contains only one leaf then return T_1
    Construct π(T) := Π(π(T_1), ..., π(T_k))
    For each block B in π(T) do
        AdamsTree(T_1|B, ..., T_k|B)
    Attach the roots of these trees to a new node v
    return this tree

T_i|X means restriction. It is defined by: for every cluster A in T_i, the restricted tree has the cluster A ∩ X.

Definition: Let π_1, ..., π_n be partitions of the set of taxa, and let Π be the product of π_1, ..., π_n. The product of these partitions is the partition for which two taxa a and b are in the same block iff they are in the same block in each of π_1, ..., π_n.

[Figure: example of two partitions and their product.]

Definition: The maximal clusters of a tree T_i are the largest proper clusters in T_i. The maximal cluster partition for T_i is the partition π(T_i) of the set of taxa with blocks equal to the maximal clusters of T_i.
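The recursive procedure can be sketched directly from the definitions above. The sketch assumes rooted trees as nested tuples over a common leaf set, with every internal node having at least two children; the function names are hypothetical.

```python
from itertools import product


def leaves(tree):
    """Leaf set of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def restrict(tree, keep):
    """T|X: drop leaves outside `keep` and suppress single-child nodes."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)


def maximal_cluster_partition(tree):
    """pi(T): the leaf sets of the children of the root."""
    return [leaves(child) for child in tree]


def partition_product(partitions):
    """Blocks are the non-empty intersections of one block per partition."""
    blocks = []
    for combo in product(*partitions):
        block = frozenset.intersection(*combo)
        if block:
            blocks.append(block)
    return blocks


def adams_tree(trees):
    """Adams consensus of rooted trees on the same leaf set."""
    if isinstance(trees[0], str):        # a single leaf is left
        return trees[0]
    blocks = partition_product([maximal_cluster_partition(t) for t in trees])
    # Recurse on every block and attach the results to a new root.
    return tuple(adams_tree([restrict(t, b) for t in trees]) for b in blocks)


t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "c"), "b"), "d")
print(adams_tree([t1, t2]))  # (('a', 'b', 'c'), 'd')
```

Both inputs agree that d branches off first, so that nesting survives; inside {a, b, c} the two trees disagree, the product partition falls apart into single leaves, and the result is an unresolved node.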

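2.1 Adams consensus example

[Figure: worked example for two input trees T_1 and T_2. It shows the maximal cluster partitions π(T_1) and π(T_2), their product Π(π(T_1), π(T_2)), and the recursive call of AdamsTree on the restricted trees A = T_1|B and B = T_2|B for a block B with four taxa, continuing until every block is a single taxon.]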

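3 Methods based on subtrees

A rooted tree T contains the rooted triple ab|c if the last common ancestor lca(a,b) is a descendant of lca(a,b,c). r(T) = the set of rooted triples in T.

An unrooted tree T contains the quartet ab|cd if the path from a to b in T does not intersect the path from c to d. q(T) = the set of quartets in T.

Rooted triples have 3 configurations; binary quartets have 3 configurations.

[Figure: the three triple configurations, the three quartet configurations, a sample rooted tree T_1 with its triple set r(T_1) (four triples), and a sample unrooted tree T_2 with its quartet set q(T_2) (three quartets).]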
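The triple set r(T) can be enumerated with a small sketch. It uses the fact that T contains ab|c exactly when some cluster of T contains a and b but not c; the nested-tuple tree format and the encoding of ab|c as the tuple (a, b, c) are assumptions of this example.

```python
from itertools import combinations


def clusters(tree):
    """All clusters (frozensets of leaf labels) of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            leaf_set = frozenset([node])
        else:
            leaf_set = frozenset().union(*(leaves(child) for child in node))
        found.add(leaf_set)
        return leaf_set

    leaves(tree)
    return found


def rooted_triples(tree):
    """r(T) as a set of tuples (x, y, z), each read as the triple xy|z."""
    cluster_set = clusters(tree)
    all_leaves = max(cluster_set, key=len)
    triples = set()
    for a, b, c in combinations(sorted(all_leaves), 3):
        # Try each of the three possible configurations on {a, b, c}.
        for x, y, z in ((a, b, c), (a, c, b), (b, c, a)):
            if any(x in s and y in s and z not in s for s in cluster_set):
                triples.add((x, y, z))
    return triples


tree = ((("a", "b"), "c"), "d")
for x, y, z in sorted(rooted_triples(tree)):
    print(f"{x}{y}|{z}")
```

For the caterpillar tree in the example this prints ab|c, ab|d, ac|d and bc|d, one resolved triple for each of the four 3-leaf subsets.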

3.1 Local consensus tree

Given a collection of rooted trees, T = {T_1, ..., T_N}.

input: R = ∩_{i=1..N} r(T_i) = the set of rooted triples appearing in all trees of T.
       L = the set of leaves to compute.
precondition: the set R restricted by L is compatible!
return: T_consensus = OneTree(R, L)

Pseudo code, with S = {x_1, ..., x_n}:

Procedure OneTree(R, S)
1. If n = 1 then return a single vertex labelled by x_1.
2. If n = 2 then return a tree with two leaves labelled x_1 and x_2.
3. Otherwise, construct the graph [R, S] as described (vertex set S, with an edge {a, b} for every triple ab|c in R whose three leaves lie in S).
4. If [R, S] has only one component then return `No Tree'.
5. For each component S_i of [R, S] do
6.     If OneTree(R, S_i) returns a tree then call it T_i else return `No Tree'.
7. end(for)
8. Construct a new tree T by connecting the roots of the trees T_i to a new root r.
9. return T.
end.

Property of the solution: R ⊆ r(T_consensus), i.e. all common input triples are contained in the consensus tree.
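The OneTree procedure translates almost line by line into code. In this sketch a triple ab|c is the tuple (a, b, c), the components of [R, S] are found with a small union-find, and `None` stands in for `No Tree'; these representation choices are assumptions of the example.

```python
def one_tree(triples, leaf_set):
    """Return a nested-tuple tree displaying all triples, or None."""
    leaf_list = sorted(leaf_set)
    if len(leaf_list) == 1:
        return leaf_list[0]
    if len(leaf_list) == 2:
        return (leaf_list[0], leaf_list[1])

    # Build [R, S]: join a and b for every triple ab|c that lies inside S.
    parent = {x: x for x in leaf_list}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    members = set(leaf_list)
    for a, b, c in triples:
        if a in members and b in members and c in members:
            parent[find(a)] = find(b)

    components = {}
    for x in leaf_list:
        components.setdefault(find(x), []).append(x)
    if len(components) == 1:
        return None                          # step 4: `No Tree'

    subtrees = []
    for component in components.values():
        subtree = one_tree(triples, component)
        if subtree is None:
            return None
        subtrees.append(subtree)
    return tuple(subtrees)                   # step 8: join under a new root


good = {("a", "b", "c"), ("a", "c", "d")}
print(one_tree(good, {"a", "b", "c", "d"}))

bad = {("a", "b", "c"), ("a", "c", "b")}
print(one_tree(bad, {"a", "b", "c"}))
```

The first call builds the caterpillar tree (((a, b), c), d). The second call receives two contradictory triples, so [R, S] is connected and the procedure reports that no tree exists.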

3.2 Local consensus tree example

Problem: two input trees, Inp.T_1 and Inp.T_2, both caterpillar trees on the leaf set L = {a, b, c, d, e, f}.

[Figure: trace of OneTree(R, L). At the top level the input triplets R are restricted by L and used to connect the leaves of L; the resulting components are handled by the recursive calls T_1 and T_2. A call whose restricted triple set is empty has nothing to connect and returns a local root joining its leaves. The final tree T_consensus joins the subtrees returned by T_1 and T_2 under a new root.]

4 Classification of consensus tree methods

[Figure: A classification of consensus tree methods. [Bryant2003] Green circle = included in this presentation.]

5 Consensus methods for trees with different taxa sets

General idea: given a tree A on three taxa and a tree B on two taxa (one of them f), we will construct a tree whose taxa are the union of both taxon sets.

From quartets to phylogenetic trees

Problems in constructing a phylogenetic tree:

Few taxa, much data: a precise tree, but with few taxa (called taxonomic sampling).
Many taxa, less data: often a bad result.

Distance- and character-based methods have these problems. We need something better.

The four-taxon approach

Idea: take a subset of taxa of size 4, take all protein sequences known for this subset of taxa, and construct quartets. This avoids both problems.

Output: an unrooted tree (it can be rooted using an outgroup).

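5.1 Problem Description

C_j is the positive weight (a real number in the interval [0;1]) of the j'th quartet. We call it the confidence value, and it is determined by the strength of the phylogenetic signal and the size of the sequence population.

We can score a tree T built from the set of quartets Q:

$$\mathrm{score}_Q(T) = \sum_{s \in S} C_s + \frac{1}{3} \sum_{u \in U} C_u, \qquad S \subseteq Q,\ U \subseteq Q$$

The set S contains the satisfied quartets (tree topology = quartet topology). The set U contains the unresolved quartets (star topology).

The sets are determined as follows. Consider a quartet ab|cd. Remove all leaves other than a, b, c, d, together with their adjacent edges. Internal nodes of degree 2 are deleted (their adjacent nodes are connected). Then we examine the topology.

We want to maximize the score. The problem is NP-hard (reduction from MAX-CUT).

Algorithms:
The exact algorithm (mildly exponential running time)
Quartet puzzling, the greedy heuristic
The geometric heuristic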
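The score can be computed with a short sketch. Instead of pruning the tree, it reads the induced topology off the clusters: a cluster that contains exactly two of the four taxa separates them from the other two. The unrooted tree is written as a nested tuple rooted at an arbitrary internal node, and a quartet ab|cd is the key (a, b, c, d) of a weight dictionary; these are assumptions of the example.

```python
def clusters(tree):
    """All clusters (frozensets of leaf labels) of a nested-tuple tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):
            leaf_set = frozenset([node])
        else:
            leaf_set = frozenset().union(*(leaves(child) for child in node))
        found.add(leaf_set)
        return leaf_set

    leaves(tree)
    return found


def induced_topology(cluster_set, four):
    """Pairing the tree induces on four taxa, or None for a star."""
    four = frozenset(four)
    for cluster in cluster_set:
        inside = cluster & four
        if len(inside) == 2:
            return frozenset([inside, four - inside])
    return None


def score(tree, weighted_quartets):
    """Sum of C over satisfied quartets plus C/3 over unresolved ones."""
    cluster_set = clusters(tree)
    total = 0.0
    for (a, b, c, d), weight in weighted_quartets.items():
        wanted = frozenset([frozenset([a, b]), frozenset([c, d])])
        topology = induced_topology(cluster_set, (a, b, c, d))
        if topology == wanted:
            total += weight          # quartet is in S
        elif topology is None:
            total += weight / 3      # quartet is in U
    return total


tree = (("a", "b"), "c", "d", "e")   # only the split ab|cde is resolved
quartets = {
    ("a", "b", "c", "d"): 1.0,       # satisfied
    ("a", "c", "b", "d"): 0.5,       # contradicted
    ("a", "c", "d", "e"): 0.75,      # unresolved in the tree
}
print(score(tree, quartets))  # 1.25
```

The contradicted quartet contributes nothing, so the total is 1.0 + 0.75 / 3 = 1.25.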

5.2 Quartet puzzling

Quartet Puzzling
input: a set of quartets, Q, on a total set of taxa or species, S.
repeat many times:
    permute S
    execute Puzzling Step (Q, S)
    add the tree to the collection of results
return: the majority consensus tree

Puzzling Step (Q, S = {a, b, c, d, e, f, g, ..})
select the quartet topology of {a, b, c, d} as anchor
for (s = e, f, g, ..):
    reset the edge counters
    for every quartet q = ij|ks in Q, where (i, j, & k) < s:
        locate the node O that intersects the paths between i, j & k
        increment every counter in every subtree from O in conflict with q
    branch from the edge with minimum count and add the new leaf s
return the resulting tree

[Figure: a tree with its edge counters during a puzzling step.]
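A single puzzling step can be sketched as below. To keep the example short it swaps the counter bookkeeping for a brute-force equivalent: every edge is tried as the attachment point and the quartets that the resulting tree contradicts are counted, which gives the same per-edge conflict counts on a binary tree. The adjacency-dictionary tree, the internal node names `_n0`, `_n1`, ..., the four-point test with unit edge lengths, and taking the first edge among ties are all assumptions of this sketch.

```python
from collections import deque


def distances(adj, source):
    """Breadth-first path lengths (in edges) from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def satisfies(adj, quartet):
    """Four-point test: does the tree display ab|cd for (a, b, c, d)?"""
    a, b, c, d = quartet
    da, db, dc = distances(adj, a), distances(adj, b), distances(adj, c)
    within = da[b] + dc[d]
    return within < da[c] + db[d] and within < da[d] + db[c]


def insert_leaf(adj, edge, leaf, new_node):
    """Copy of the tree with `leaf` hung on a new node that splits `edge`."""
    u, v = edge
    new = {node: set(nbrs) for node, nbrs in adj.items()}
    new[u].discard(v)
    new[v].discard(u)
    new[u].add(new_node)
    new[v].add(new_node)
    new[new_node] = {u, v, leaf}
    new[leaf] = {new_node}
    return new


def puzzling_step(quartets, order):
    """quartets maps a frozenset of four taxa to (a, b, c, d), read as ab|cd."""
    a, b, c, d = quartets[frozenset(order[:4])]      # anchor quartet
    adj = {a: {"_n0"}, b: {"_n0"}, c: {"_n1"}, d: {"_n1"},
           "_n0": {a, b, "_n1"}, "_n1": {c, d, "_n0"}}
    placed = set(order[:4])
    for step, s in enumerate(order[4:], start=2):
        relevant = [q for taxa, q in quartets.items()
                    if s in taxa and taxa - {s} <= placed]
        edges = sorted({tuple(sorted((u, v))) for u in adj for v in adj[u]})
        best_tree, best_count = None, None
        for edge in edges:
            candidate = insert_leaf(adj, edge, s, f"_n{step}")
            conflicts = sum(not satisfies(candidate, q) for q in relevant)
            if best_count is None or conflicts < best_count:
                best_tree, best_count = candidate, conflicts
        adj = best_tree
        placed.add(s)
    return adj


true_quartets = [("a", "b", "c", "d"), ("a", "b", "c", "e"), ("a", "b", "d", "e"),
                 ("a", "c", "d", "e"), ("b", "c", "d", "e")]
Q = {frozenset(q): q for q in true_quartets}
tree = puzzling_step(Q, ["a", "b", "c", "d", "e"])
print(all(satisfies(tree, q) for q in true_quartets))  # True
```

With a consistent quartet set, as in the example, some edge has zero conflicts at every step, so the step recovers a tree that displays every input quartet. The full method repeats this step for many random orders of S and returns the majority consensus of the resulting trees.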

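5.3 The Geometric Heuristic

The problem can be addressed geometrically and solved using SDP (semidefinite programming). SDP gives an approximation with any desired precision.

Idea: use the unit sphere in R^n, where n is the number of taxa. For every quartet ab|cd, place a and b close to each other, but a and c far from each other. a, b, c, d are placed on the boundary of the unit sphere.

We can now formulate the semidefinite programming problem for the quartets a_j b_j | c_j d_j, 1 ≤ j ≤ k, where v_1, ..., v_n are the points of the n taxa:

Maximize
$$\sum_{1 \le j \le k} C_j \left( \langle a_j, b_j \rangle + \langle c_j, d_j \rangle \right) - 0.5 \sum_{1 \le j \le k} C_j \left( \langle a_j, c_j \rangle + \langle a_j, d_j \rangle + \langle b_j, c_j \rangle + \langle b_j, d_j \rangle \right)$$

Subject to
$$\langle v_i, v_i \rangle = 1, \qquad 1 \le i \le n$$

For a single quartet with weight C:

Max = 4C: a and b are placed at the same point, and c and d are placed at the antipodal point (the diametrically opposite point). The term is then C(1 + 1) - 0.5 C(-4) = 4C.

Min = -2C: a and c are placed at the same point, and b and d at the antipodal point. The term is then C(-1 - 1) - 0.5 C(1 - 1 - 1 + 1) = -2C.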
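The contribution of one quartet to the SDP objective can be checked numerically. The sketch below only evaluates the objective term for given points; it does not solve the semidefinite program, and the function names are hypothetical.

```python
def dot(u, v):
    """Inner product of two coordinate tuples."""
    return sum(x * y for x, y in zip(u, v))


def quartet_term(a, b, c, d, weight):
    """Objective contribution of the quartet ab|cd with confidence `weight`."""
    together = dot(a, b) + dot(c, d)
    apart = dot(a, c) + dot(a, d) + dot(b, c) + dot(b, d)
    return weight * (together - 0.5 * apart)


north = (0.0, 0.0, 1.0)
south = (0.0, 0.0, -1.0)

# a and b coincide, c and d sit at the antipodal point: the best placement.
best = quartet_term(north, north, south, south, 1.0)
# a and c coincide, b and d sit at the antipodal point: a conflicting placement.
conflicting = quartet_term(north, south, north, south, 1.0)
print(best, conflicting)
```

The placement that agrees with the quartet reaches the value 4C for C = 1, and the conflicting placement scores strictly lower, which is what drives the optimizer to put the two sides of each quartet far apart.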

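5.4 The Geometric Heuristic

Improvements:

Threshold for the confidence level: quartets with a low confidence level may be built from inconsistent data.

To avoid points being placed at the same point, we want a small minimum distance between the points. We add this constraint to the SDP problem:

$$\langle v_i, v_j \rangle \le 1 - \varepsilon, \qquad 1 \le i < j \le n$$

ε should be relatively small (e.g. 0.25). We'll obtain a better score.

Geometrical Clustering

After we have solved the SDP problem, we have the taxa of the quartets placed (as points) on the unit sphere. We want to join them to get a tree. This is done in this way:

Initialization: n clusters, each containing a single point.
At each step, we decrease the number of clusters. The procedure terminates when the number of clusters = 1.
At each step, we remove 2 clusters and add 1.
Selection is done by calculating the pairwise Euclidean distance.
The point associated with the new cluster is the center of mass of the points of the removed clusters (i.e. of their taxa).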
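The clustering steps can be sketched as a small agglomerative loop. The input format (a dictionary from taxon name to coordinates), the nested-tuple output, and the function name are assumptions of this example; the final merge acts as an arbitrary root, which can be ignored to read the result as an unrooted tree.

```python
import math
from itertools import combinations


def geometric_clustering(points):
    """Merge the two closest clusters until a single tree remains.

    Each cluster is (subtree, centre of mass, number of taxa).
    """
    clusters = [(name, tuple(p), 1) for name, p in sorted(points.items())]
    while len(clusters) > 1:
        # Select the pair of clusters with the smallest Euclidean distance.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: math.dist(clusters[ij[0]][1], clusters[ij[1]][1]))
        (tree_i, p_i, n_i), (tree_j, p_j, n_j) = clusters[i], clusters[j]
        # Centre of mass of all taxa points in the two removed clusters.
        centre = tuple((n_i * x + n_j * y) / (n_i + n_j) for x, y in zip(p_i, p_j))
        merged = ((tree_i, tree_j), centre, n_i + n_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


def on_circle(angle):
    """Point on the unit circle, standing in for a point on the unit sphere."""
    return (math.cos(angle), math.sin(angle))


points = {"a": on_circle(0.0), "b": on_circle(0.2),
          "c": on_circle(3.0), "d": on_circle(3.3)}
print(geometric_clustering(points))  # (('a', 'b'), ('c', 'd'))
```

Here a and b lie close together, as do c and d, so the two pairs are merged first and joined last, giving the quartet topology ab|cd.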

References

[Bryant2003] David Bryant, "A Classification of Consensus Methods for Phylogenetics", Bioconsensus, Proc. of Tutorial and Workshop on Bioconsensus, II DIMACS-AMS, (2003) 55-66.

[Chor1998] Benny Chor, "From Quartets to Phylogenetic Trees", B. Rovan (Ed.): SOFSEM 98: Theory and Practice of Informatics, LNCS 1521, pp. 36-53, 1998.