Computational Biology, Phylogenetic Trees
Consensus methods
Asger Bruun & Bo Simonsen
16 January 2008
Department of Computer Science, University of Copenhagen
0 Motivation

Given a collection of trees T = {T_0, ..., T_n}, we want to find a common tree that combines all trees in T:
- Different algorithms for constructing a phylogenetic tree give different results; we want to combine those into one single tree.
- We may have different biological data for each species (taxon); we want to represent them in one phylogenetic tree using consensus methods.

Methods

We will consider methods for two settings:
- All trees (input and output) have the same set of taxa.
- Each input tree has a subset of the taxa, but the output tree contains the full set of taxa.
1 Methods based on splits and clusters

In a phylogenetic tree (rooted or unrooted), every leaf represents a taxon.
A phylogenetic tree can be subdivided into clusters (also called monophyletic groups or clades) or splits.
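1.1 Strict consensus

A simple method for tree consensus is: select the clusters or splits common to every input tree.

[Figure: two examples of input trees combined ("+") into their strict consensus tree ("=").]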
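The strict consensus rule for rooted trees can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the nested-tuple tree encoding and the names `clusters` and `strict_consensus_clusters` are assumptions made for the example.

```python
from functools import reduce


def clusters(tree):
    """Return all clusters of a rooted tree as frozensets of leaf labels.

    Assumed encoding: a leaf is a str, an inner node is a tuple of subtrees.
    """
    found = set()

    def leaves_below(node):
        if isinstance(node, str):          # a leaf is its own (trivial) cluster
            below = frozenset([node])
        else:                              # inner node: union of its children
            below = frozenset().union(*(leaves_below(child) for child in node))
        found.add(below)
        return below

    leaves_below(tree)
    return found


def strict_consensus_clusters(trees):
    """Keep only the clusters that occur in every input tree."""
    return reduce(set.intersection, (clusters(t) for t in trees))


t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "b"), "d"), "c")
common = strict_consensus_clusters([t1, t2])
# Non-trivial clusters: more than one leaf, fewer than all four leaves.
nontrivial = {c for c in common if 1 < len(c) < 4}
print(sorted(sorted(c) for c in nontrivial))  # prints [['a', 'b']]
```

The two example trees disagree on where c and d attach, so only the cluster {a, b} survives; the strict consensus tree is therefore ((a, b), c, d).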
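1.2 More methods based on splits and clusters

Majority rule
Like strict, but only select the clusters / splits which are present in more than 50% of the trees in T.

Loose consensus (aka semi-strict)
Select the clusters / splits of the input trees which are compatible with every tree in T.

[Figure: input trees T_1 and T_2 and their loose consensus tree.]

Definition: A collection of groups C is compatible if there exists a tree T' s.t. every group in C is a cluster / split of T'.

Greedy consensus
Like the loose consensus tree, but the input clusters / splits are sorted by frequency. Take the element with the highest frequency first, and build up a collection of compatible clusters / splits by adding each following element only if it is compatible with those already selected. This gives the greedy consensus tree.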
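A small Python sketch of the majority rule and the greedy consensus on clusters of rooted trees. It assumes each input tree is already given as its set of non-trivial clusters, and it uses the standard fact that a collection of clusters fits into one rooted tree exactly when every pair is nested or disjoint (splits of unrooted trees would need a different pairwise test). The function names and the tie-breaking rule are choices made for this example.

```python
from collections import Counter


def compatible(c1, c2):
    """Two clusters fit in one rooted tree iff they are nested or disjoint."""
    return c1 <= c2 or c2 <= c1 or not (c1 & c2)


def majority_rule(cluster_sets):
    """Clusters present in strictly more than half of the input trees."""
    counts = Counter(c for cs in cluster_sets for c in cs)
    return {c for c, k in counts.items() if 2 * k > len(cluster_sets)}


def greedy_consensus(cluster_sets):
    """Scan clusters by falling frequency, keeping each one that stays compatible."""
    counts = Counter(c for cs in cluster_sets for c in cs)
    # Ties are broken by the sorted leaf labels, only to make the run repeatable.
    order = sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    chosen = []
    for cluster, _ in order:
        if all(compatible(cluster, other) for other in chosen):
            chosen.append(cluster)
    return set(chosen)


ab, abc, ac, bc, de = (frozenset(s) for s in ("ab", "abc", "ac", "bc", "de"))
# Four trees on the taxa a..e, each given by its non-trivial clusters.
trees = [{ab, abc}, {ab, de}, {ac, de}, {bc, abc}]
majority = majority_rule(trees)
greedy = greedy_consensus(trees)
```

In this example no cluster occurs in more than two of the four trees, so the majority rule keeps nothing, while the greedy method keeps {a, b}, {a, b, c} and {d, e} and rejects {a, c} and {b, c} because they conflict with {a, b}.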
2 Methods based on intersection

Adams consensus
- The first of all consensus methods.
- Works only on rooted trees; there is no analogue for unrooted trees.

We construct the tree recursively, using this algorithm:

    Procedure AdamsTree(T_1, ..., T_k)
        if T_1 contains one leaf
            return T_1
        Construct π(T) := Π(π(T_1), ..., π(T_k))
        For each block B in π(T)
            AdamsTree(T_1|B, ..., T_k|B)
        Attach the roots of these trees to a new node v
        return this tree

T_i|X means the restriction of T_i to X. It is defined by: for every cluster A in T_i, the restricted tree has the cluster A ∩ X (if it is non-empty).

Definition: Let π_1, .., π_n be partitions of the set of taxa. Π(π_1, .., π_n) is the product of π_1, .., π_n: the partition in which two taxa a and b are in the same block iff they are in the same block in each of π_1, .., π_n.

[Figure: example of the product of two partitions.]

Definition: The maximal clusters of a tree T_i are the largest proper clusters in T_i. The maximal cluster partition for T_i is the partition π(T_i) of the set of taxa with blocks equal to the maximal clusters of T_i.
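The AdamsTree recursion can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: rooted trees are encoded as nested tuples of leaf labels, all input trees share the same taxa, and the helper names (`leaf_set`, `restrict`, `maximal_cluster_partition`, `partition_product`, `adams_tree`) are invented for the example.

```python
def leaf_set(tree):
    """All leaf labels below a node (a leaf is a str, an inner node a tuple)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def restrict(tree, keep):
    """T|X: drop leaves outside `keep` and suppress nodes left with one child."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(child, keep) for child in tree) if r is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)


def maximal_cluster_partition(tree):
    """pi(T): the leaf sets below the children of the root."""
    return [leaf_set(child) for child in tree]


def partition_product(partitions, taxa):
    """Two taxa share a block iff they share a block in every input partition."""
    blocks = {}
    for x in sorted(taxa):
        key = tuple(next(i for i, b in enumerate(p) if x in b) for p in partitions)
        blocks.setdefault(key, set()).add(x)
    return [frozenset(b) for b in blocks.values()]


def adams_tree(trees):
    if isinstance(trees[0], str):          # a single leaf: nothing to combine
        return trees[0]
    taxa = leaf_set(trees[0])
    blocks = partition_product([maximal_cluster_partition(t) for t in trees], taxa)
    # Recurse on every block and attach the results to a new root.
    return tuple(adams_tree([restrict(t, b) for t in trees]) for b in blocks)


t1 = ((("a", "b"), "c"), "d")
t2 = ((("a", "b"), "d"), "c")
result = adams_tree([t1, t2])
```

Here π(t1) = {a,b,c | d} and π(t2) = {a,b,d | c}, so the product is {a,b | c | d}; the recursion on the block {a, b} returns the cherry (a, b), and the result is ((a, b), c, d).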
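2.1 Adams consensus example

[Figure: worked example of AdamsTree on two rooted input trees T_1 and T_2 whose taxa include f and g. The first step computes the maximal cluster partitions π(T_1) and π(T_2) and their product Π(π(T_1), π(T_2)). The algorithm then recurses on a four-taxon block X, with A = T_1|X and B = T_2|X, and computes π(A), π(B) and Π(π(A), π(B)).]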
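3 Methods based on subtrees

A rooted tree, T, contains the rooted triple ab|c if the last common ancestor lca(a, b) is a proper descendant of lca(a, b, c).
r(T) = the set of rooted triples in T.

An unrooted tree, T, contains the quartet ab|cd if the path from a to b in T does not intersect the path from c to d.
q(T) = the set of quartets in T.

Rooted triples have 3 configurations: ab|c, ac|b, bc|a.
Binary quartets have 3 configurations: ab|cd, ac|bd, ad|bc.

[Figure: the three configurations of each kind, and two sample trees T_1 and T_2 with their sets r(T_1) and q(T_2).]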
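As an illustration of the definition of r(T), the following Python sketch enumerates the rooted triples of a tree. The nested-tuple encoding and the function name `rooted_triples` are assumptions for this example; a triple ab|c is represented as the pair (frozenset({a, b}), c).

```python
from itertools import combinations


def rooted_triples(tree):
    """Return r(T) as a set of (frozenset({a, b}), c) pairs, each meaning ab|c."""
    triples = set()

    def visit(node):
        if isinstance(node, str):
            return [node]
        child_leaves = [visit(child) for child in node]
        # If a and b lie below one child and c below another child, then this
        # node is lca(a, b, c) and lca(a, b) lies strictly below it.
        for i, inside in enumerate(child_leaves):
            outside = [x for j, other in enumerate(child_leaves) if j != i for x in other]
            for a, b in combinations(inside, 2):
                for c in outside:
                    triples.add((frozenset((a, b)), c))
        return [x for leaves in child_leaves for x in leaves]

    visit(tree)
    return triples


tree = ((("a", "b"), "c"), "d")
r = rooted_triples(tree)
```

For the caterpillar tree (((a, b), c), d) this yields the four triples ab|c, ab|d, ac|d and bc|d, one for each choice of three leaves, as expected for a binary tree.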
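3.1 Local consensus tree

Given a collection of rooted trees, T = {T_1, .., T_N}:

input: R = ⋂_{i=1..N} r(T_i) = the set of rooted triples appearing in all trees of T.
       L = the set of leaves to compute the tree on.
precondition: the set R restricted to L is compatible!
return: T_consensus = OneTree(R, L)

Pseudo code:

    Procedure OneTree(R, S), where S = {x_1, ..., x_n}
    1. If n = 1 then return a single vertex labelled by x_1.
    2. If n = 2 then return a tree with two leaves labelled x_1 and x_2.
    3. Otherwise, construct [R, S] as described (the graph with vertex set S and an edge between a and b for every triple ab|c in R with a, b, c in S).
    4. If [R, S] has only one component then return `No Tree'.
    5. For each component S_i of [R, S] do
    6.     If OneTree(R, S_i) returns a tree then call it T_i else return `No Tree'.
    7. end(for)
    8. Construct a new tree T by connecting the roots of the trees T_i to a new root r.
    9. return T.
    end.

Property of the solution: R ⊆ r(T_consensus), i.e. all common input triples are contained in the consensus tree.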
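The OneTree procedure can be sketched in Python as below. The sketch assumes a triple ab|c is passed as the tuple (a, b, c), returns `None` in place of `No Tree', and finds the components of [R, S] with a small union-find; these representation choices and the name `one_tree` are assumptions for the example.

```python
def one_tree(triples, leaves):
    """Build a rooted tree (nested tuples) displaying all triples, or None."""
    leaves = sorted(leaves)
    if len(leaves) == 1:
        return leaves[0]
    if len(leaves) == 2:
        return (leaves[0], leaves[1])

    # Union-find over the current leaf set S.
    parent = {x: x for x in leaves}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    # [R, S]: keep the triples that lie inside S and join a with b for each ab|c.
    inside = [t for t in triples if all(x in parent for x in t)]
    for a, b, _ in inside:
        parent[find(a)] = find(b)

    components = {}
    for x in leaves:
        components.setdefault(find(x), []).append(x)
    if len(components) == 1:                # step 4: the triples are incompatible
        return None

    subtrees = []
    for component in components.values():   # steps 5 to 7
        sub = one_tree(inside, component)
        if sub is None:
            return None
        subtrees.append(sub)
    return tuple(subtrees)                   # step 8: attach to a new root


good = one_tree([("a", "b", "c"), ("c", "d", "a")], "abcd")
bad = one_tree([("a", "b", "c"), ("b", "c", "a")], "abc")
```

The first call joins a with b and c with d, giving two components and the tree ((a, b), (c, d)). The second call has the conflicting triples ab|c and bc|a, so [R, S] is connected and the result is `None`.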
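3.2 Local consensus tree example

Problem: two rooted input trees on six leaves, Inp.T_1 and Inp.T_2. Both are caterpillar trees; f is the outermost leaf of Inp.T_1 and the first leaf attached above the innermost pair in Inp.T_2.

[Figure: worked run of OneTree on the two input trees.
- T_consensus: input leaves L = {a, b, c, d, e, f}; the input triples R consist of two triples. R restricted to L equals R, and the leaves of L are connected using R.
- Recursive call T_1 on a two-leaf set: R restricted to these leaves is Ø, so there is nothing to connect. Return: the two leaves attached to a new local root.
- Recursive call T_2: handled in the same way.
- Return: the subtrees T_1 and T_2 attached to a new local root.]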
4 Classification of consensus tree methods

Figure: A classification of consensus tree methods [Bryant2003]. Green circle = included in this presentation.
5 Consensus methods for trees with different taxa sets

General idea: given a tree A on three taxa and a tree B on two other taxa (one of them f), we will construct a tree on all five taxa.

From quartets to phylogenetic trees

Problems in constructing a phylogenetic tree:
- Few taxa, much data: a precise tree, but with few taxa (called taxonomic sampling).
- Many taxa, less data: the result is often bad.
- Distance-based and character-based methods have these problems.
We need something better.

The Four Taxon approach

Idea: take a subset of taxa of size 4, take all protein sequences known for that subset of taxa, and construct quartets. This avoids both problems.
Output: an unrooted tree (it can be rooted using an outgroup).
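5.1 Problem Description

C_j is a positive weight (a real number in the interval [0, 1]) of the j'th quartet. We call it the confidence value. It is determined by:
- The strength of the phylogenetic signal.
- The size of the sequence population.

We can score a tree T built from the set of quartets Q:

$$\mathrm{score}_Q(T) = \sum_{s \in S} C_s + \frac{1}{3} \sum_{u \in U} C_u, \qquad S \subseteq Q,\; U \subseteq Q$$

- The set S contains the satisfied quartets (tree topology = quartet topology).
- The set U contains the unresolved quartets (star topology).

The sets are determined as follows. Consider a quartet on the taxa a, b, c, d:
- Remove all leaves of T but a, b, c, d, together with their adjacent edges.
- Delete the internal nodes with degree 2 (connect their adjacent nodes).
- Then examine the topology.

We want to maximize the score. The problem is NP-hard (reduction from MAX-CUT).

Algorithms
- The exact algorithm (mildly exponential running time)
- Quartet puzzling (the greedy heuristic)
- The geometric heuristic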
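The scoring rule (full weight for a satisfied quartet, one third of the weight for an unresolved one, nothing for a violated one) can be illustrated with a short Python sketch. Instead of pruning the tree as described in the text, the sketch assumes the unrooted tree is given by one side of each of its non-trivial splits and reads the induced quartet topology off those splits; this shortcut and the names `induced_topology` and `score` are assumptions for the example.

```python
def induced_topology(splits, taxa, quartet):
    """Return the pairing the tree induces on four taxa, or None for a star."""
    a, b, c, d = quartet
    for p, q in (((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))):
        for side in splits:
            other = taxa - side
            # The tree shows p|q if some split puts p on one side and q on the other.
            if (set(p) <= side and set(q) <= other) or (set(q) <= side and set(p) <= other):
                return frozenset((frozenset(p), frozenset(q)))
    return None


def score(splits, taxa, weighted_quartets):
    """Sum C for satisfied quartets plus C/3 for unresolved quartets."""
    total = 0.0
    for (a, b, c, d), confidence in weighted_quartets:   # the quartet ab|cd
        induced = induced_topology(splits, taxa, (a, b, c, d))
        if induced is None:
            total += confidence / 3
        elif induced == frozenset((frozenset((a, b)), frozenset((c, d)))):
            total += confidence
    return total


taxa = frozenset("abcde")
tree_splits = [frozenset("ab"), frozenset("de")]   # the tree ((a,b),c,(d,e))
quartets = [
    (("a", "b", "c", "d"), 0.9),   # ab|cd: satisfied
    (("a", "c", "d", "e"), 0.6),   # ac|de: satisfied
    (("a", "d", "b", "e"), 0.3),   # ad|be: violated, the tree shows ab|de
]
resolved_score = score(tree_splits, taxa, quartets)
star_score = score([], taxa, quartets)             # star tree: all unresolved
```

With these hypothetical weights the resolved tree scores 0.9 + 0.6 = 1.5, while the star tree scores (0.9 + 0.6 + 0.3) / 3 = 0.6.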
5.2 Quartet puzzling

Quartet Puzzling
    input: a set of quartets, Q, on the total set of taxa or species, S.
    repeat many times:
        permute S
        execute Puzzling Step (Q, S)
        add the tree to the collection of results
    return: the majority consensus tree

Puzzling Step (Q, S = {a, b, c, d, e, f, g, ..})
    select the quartet topology of {a, b, c, d} as anchor
    for (s = e, f, g, ..):
        reset the edge counters
        for every quartet, q, on the taxa i, j, k, s in Q, where (i, j, & k) < s:
            locate the node, O, where the paths between i, j & k intersect
            increment every edge counter in every subtree from O in conflict with q
        branch off the edge with the minimum count and add the new leaf, s
    return the resulting tree

[Figure: example tree with edge counters, some of them incremented (0++).]
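5.3 The Geometric Heuristic

The problem can be addressed geometrically and solved using SDP (semidefinite programming). SDP gives an approximation with any desired precision.

Idea: use the unit sphere in R^n, where n is the number of taxa. For every quartet ab|cd, place a and b (and likewise c and d) close to each other, but the two pairs far from each other. a, b, c, d are placed on the boundary of the unit sphere.

We can now formulate the semidefinite programming problem, where the j'th of the k quartets is a_j b_j | c_j d_j and v_i is the vector of taxon i:

$$\text{Maximize} \quad \sum_{1 \le j \le k} C_j \left( \langle a_j, b_j \rangle + \langle c_j, d_j \rangle \right) - 0.5 \sum_{1 \le j \le k} C_j \left( \langle a_j, c_j \rangle + \langle a_j, d_j \rangle + \langle b_j, c_j \rangle + \langle b_j, d_j \rangle \right)$$

$$\text{Subject to} \quad \langle v_i, v_i \rangle = 1, \qquad 1 \le i \le n$$

For a single quartet ab|cd with weight C:
- Max = 4C: a and b are placed at the same point, and c and d are placed at the antipodal point (the diametrically opposite point).
- Min = -2C: a and c are placed at the same point, and b and d at the antipodal point (then ⟨a, b⟩ = ⟨c, d⟩ = -1 and the four cross terms cancel).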
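5.4 The Geometric Heuristic

Improvements:
- Threshold on the confidence level: quartets with a low confidence level may be built from inconsistent data.
- To avoid several taxa being placed at the same point, we want a small minimum distance between the points. We add this constraint to the SDP problem:

$$\langle v_i, v_j \rangle \le 1 - \varepsilon, \qquad 1 \le i < j \le n$$

  ε should be relatively small (e.g. 0.25). We will obtain a better score.

Geometrical Clustering

After we have solved the SDP problem, the taxa of the quartets have been placed as points on the unit sphere. We want to join them to get a tree. This can be done in this way:
- Initialization: n clusters, each containing a single point.
- At each step, we decrease the number of clusters: we remove 2 clusters and add 1.
- The selection is done by calculating pairwise Euclidean distances.
- The point associated with the new cluster is the centre of mass of the points (i.e. taxa) of the removed clusters.
- The procedure terminates when the number of clusters is 1.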
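The clustering step can be sketched in Python as follows. The sketch assumes that the two clusters with the smallest Euclidean distance between their associated points are the ones merged at each step, which is one natural reading of the selection rule; the function name `centroid_clustering` and the two-dimensional example coordinates are invented for illustration. The nested tuples it returns record the merge order, so the result is drawn as rooted even though the method produces an unrooted tree.

```python
import math


def centroid_clustering(points):
    """points: dict taxon -> coordinate tuple. Returns the merges as nested tuples."""
    # Each cluster is (subtree, centre of mass, number of original points).
    clusters = [(name, tuple(p), 1) for name, p in points.items()]
    while len(clusters) > 1:
        # Pick the pair of clusters whose centres are closest (assumed rule).
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: math.dist(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        (ti, ci, ni), (tj, cj, nj) = clusters[i], clusters[j]
        # Centre of mass of all original points in the two removed clusters.
        centre = tuple((ni * x + nj * y) / (ni + nj) for x, y in zip(ci, cj))
        merged = ((ti, tj), centre, ni + nj)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Hypothetical placement of four taxa on the unit circle.
points = {
    "a": (1.0, 0.0),
    "b": (0.8, 0.6),
    "c": (-1.0, 0.0),
    "d": (-0.6, 0.8),
}
tree = centroid_clustering(points)
```

In this example a and b are merged first, then c and d, and finally the two pairs, giving ((a, b), (c, d)).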
References

[Bryant2003] David Bryant, "A Classification of Consensus Methods for Phylogenetics", Bioconsensus, Proc. of Tutorial and Workshop on Bioconsensus, II DIMACS-AMS, (2003) 55-66.

[Chor1998] Benny Chor, "From Quartets to Phylogenetic Trees", B. Rovan (Ed.): SOFSEM 98: Theory and Practice of Informatics, LNCS 1521, pp. 36-53, 1998.