2 UPGMA
Unweighted Pair-Group Method with Arithmetic mean
Unweighted = all pairwise distances contribute equally.
Pair-Group = groups are combined in pairs.
Arithmetic mean = pairwise distances to each group (clade) are mean distances to all members of that group.
Sokal R & Michener C (1958). A statistical method for evaluating systematic relationships. University of Kansas Science Bulletin 38:

3 UPGMA: Principle
Start with unjoined nodes (A, B, C, D, E) and a pair-wise distance matrix:
         A         B         C         D         E
    A    -
    B    d_{A,B}   -
    C    d_{A,C}   d_{B,C}   -
    D    d_{A,D}   d_{B,D}   d_{C,D}   -
    E    d_{A,E}   d_{B,E}   d_{C,E}   d_{D,E}   -
Find the 2 nodes with the shortest distance (here: C+D).
Join the 2 nodes.
Compute the branch lengths and the distances from the new node CD to the remaining nodes (d_{CD,A}, d_{CD,B}, d_{CD,E}).

4 UPGMA: Principle
Repeat this process iteratively until the whole tree is obtained.
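
The iteration described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation behind the slides: the function name `upgma_step`, the dict-of-dicts layout and the toy distances are all assumptions made for the example. Each call joins the closest pair, records the depth of the new node (half the joined distance) and replaces the two rows by their size-weighted mean.

```python
from itertools import combinations

def upgma_step(d, size):
    """One UPGMA iteration on a symmetric dict-of-dicts matrix (modified in place).

    size[n] is the number of leaves below node n.
    Returns (x, y, depth): the joined nodes and the depth of the new node.
    """
    nodes = list(d)
    # 1. Find the two nodes with the shortest distance.
    x, y = min(combinations(nodes, 2), key=lambda p: d[p[0]][p[1]])
    depth = d[x][y] / 2
    new = x + y
    # 2. Mean distance from the joined group to every remaining node.
    row = {z: (size[x] * d[x][z] + size[y] * d[y][z]) / (size[x] + size[y])
           for z in nodes if z not in (x, y)}
    # 3. Replace rows/columns x and y by the new node.
    for n in (x, y):
        del d[n]
    for z, value in row.items():
        del d[z][x], d[z][y]
        d[z][new] = value
    d[new] = row
    size[new] = size.pop(x) + size.pop(y)
    return x, y, depth

# Hypothetical toy distances (not taken from the slides); C and D are closest.
pairs = {("A", "B"): 4, ("A", "C"): 6, ("A", "D"): 6, ("A", "E"): 10,
         ("B", "C"): 6, ("B", "D"): 6, ("B", "E"): 10,
         ("C", "D"): 2, ("C", "E"): 10, ("D", "E"): 10}
d = {n: {} for n in "ABCDE"}
for (a, b), value in pairs.items():
    d[a][b] = d[b][a] = value
size = {n: 1 for n in d}

steps = []
while len(d) > 1:          # repeat until the whole tree is obtained
    steps.append(upgma_step(d, size))
print(steps)
```

With these toy values the first join is C+D, as in the principle slide; the later joins depend entirely on the assumed numbers.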

5 UPGMA: Example
[Distance matrix for nodes A, B, C, D, E, F and G]
The distance matrix can be obtained from pair-wise sequence alignments.
The following example is from Dr Richard J. Edwards.

6 UPGMA: Example
Find the shortest distance. Here the shortest distance is 1 (between B and F).
Join the "nodes" (sequences) with the shortest distance: here we join B and F to create node BF.
Depth of the new branch = 1/2 of the shortest distance (so that the node-to-node path length is equal to the shortest distance). Here: d_{B,F} / 2 = 0.5

7 UPGMA: Example
[Distance matrix with B and F merged into node BF; the distances from BF to the other nodes are still unknown ("?")]
Calculate the mean pairwise distances between the new node and the other nodes (sequences).

8 UPGMA: Example
Calculate the mean pairwise distances between the new node and the other nodes (sequences).
Example: d_{BF,A} = (d_{B,A} + d_{F,A}) / 2 = (…) / 2 = 18.5
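
As a small illustration of this averaging step, the sketch below computes the mean of all leaf-to-leaf distances between two groups. The helper name `mean_distance` is hypothetical, and the two input distances (19 and 18) are assumed values chosen to reproduce the 18.5 quoted on the slide, since the operands did not survive in the text.

```python
# Leaf-to-leaf distances; 19 and 18 are assumed inputs, not values read from the slide.
d = {("A", "B"): 19, ("A", "F"): 18}

def mean_distance(group, other, d):
    """Arithmetic mean of all leaf-to-leaf distances between two groups."""
    pairs = [tuple(sorted((x, y))) for x in group for y in other]
    return sum(d[p] for p in pairs) / len(pairs)

d_BF_A = mean_distance(("B", "F"), ("A",), d)
print(d_BF_A)  # 18.5
```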

9 UPGMA: Example
Repeat the cycle with the new shortest distance. Here, the next shortest distance is 8 (between A and D). We thus join A and D with branch length = 8 / 2 = 4.

10 UPGMA: Example
We join the closest nodes/groups and recalculate the distances between nodes/groups.
Example: d_{BF,AD} = (d_{B,A} + d_{F,A} + d_{B,D} + d_{F,D}) / 4 = (…) / 4 = 18
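
The group-to-group distance can be obtained either by averaging all leaf pairs, as in the formula above, or by a size-weighted update of the distances already in the reduced matrix; the two agree. The sketch below checks this for BF and AD. The four leaf distances are assumed values, chosen only so that the result matches the 18 shown on the slide.

```python
# Assumed leaf-to-leaf distances (hypothetical inputs consistent with the slide's result).
leaf = {("A", "B"): 19, ("A", "F"): 18, ("B", "D"): 18, ("D", "F"): 17}

# (a) Mean over all four leaf pairs between the groups BF and AD.
all_pairs = sum(leaf.values()) / len(leaf)

# (b) Size-weighted update from the distances BF-A and BF-D of the previous matrix.
d_BF_A = (leaf[("A", "B")] + leaf[("A", "F")]) / 2
d_BF_D = (leaf[("B", "D")] + leaf[("D", "F")]) / 2
size_A, size_D = 1, 1  # A and D are single sequences before being joined
weighted = (size_A * d_BF_A + size_D * d_BF_D) / (size_A + size_D)

print(all_pairs, weighted)  # 18.0 18.0
```

The weighted form is what keeps the method "unweighted" with respect to sequences: larger groups contribute proportionally more terms to the mean.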

11 UPGMA: Example
Repeat the cycle with the new shortest distance. Here, the next shortest distance is 12.5 (between BF and G). We thus join BF and G with branch length = 12.5 / 2 = 6.25.

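12 UPGMA: Example
[Distance matrix after joining BF and G into BFG]
The distances between nodes/groups are recalculated.

13 UPGMA: Example
The shortest distance is identified again, the nodes/groups are joined and the branch length is calculated.

14 UPGMA: Example
[Distance matrix after joining AD and BFG into ADBFG]

15 UPGMA: Example
[Distance matrix for nodes ADBFG, C and E, with d_{ADBFG,C} = 29; C is joined to the tree]

16 UPGMA: Example
[Distance matrix for nodes ADBFGC and E, with d_{ADBFGC,E} = 34; E is joined to the tree]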

17 UPGMA: Example
Remark: The source data for this example is a selection of Cytochrome C distances from Table 3 of Fitch & Margoliash (1967) Construction of phylogenetic trees, Science 155:
A = Turtle, B = Human (Man), C = Tuna, D = Chicken, E = Moth, F = Monkey, G = Dog
[Distance matrix labelled with species names, e.g. Turtle-Human = 19]
[Final UPGMA tree with its Newick representation; legible branch lengths: Turtle 4, Monkey 0.5, Dog 6.25, internal branches 5.75 and 2.5]
Source: Dr Richard J. Edwards
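
To show how the whole example produces a Newick string, here is a minimal UPGMA sketch. The distance values are an assumption: the original matrix did not survive in the text, so the numbers below were chosen to be consistent with the intermediate results quoted on the slides (1, 8, 18.5, 18, 12.5, 29, 34). The function name `upgma_newick` and the data layout are hypothetical.

```python
from itertools import combinations

NAMES = "ABCDEFG"
# Lower-triangular distances (assumed values, see the note above).
ROWS = {
    "B": [19],
    "C": [27, 31],
    "D": [8, 18, 26],
    "E": [33, 36, 41, 31],
    "F": [18, 1, 32, 17, 35],
    "G": [13, 13, 29, 14, 28, 12],
}

def upgma_newick(names, rows):
    """Run UPGMA and return the rooted tree as a Newick string."""
    d = {frozenset((n, names[j])): v
         for n, vals in rows.items() for j, v in enumerate(vals)}
    size = {n: 1 for n in names}        # number of leaves per node
    height = {n: 0.0 for n in names}    # depth of each node above its leaves
    newick = {n: n for n in names}
    active = list(names)
    while len(active) > 1:
        # Join the two closest nodes; the new node sits at half their distance.
        x, y = min(combinations(active, 2), key=lambda p: d[frozenset(p)])
        h = d[frozenset((x, y))] / 2
        new = x + y
        # Size-weighted mean distance from the new node to the remaining nodes.
        for z in active:
            if z not in (x, y):
                d[frozenset((new, z))] = (
                    size[x] * d[frozenset((x, z))] + size[y] * d[frozenset((y, z))]
                ) / (size[x] + size[y])
        size[new] = size[x] + size[y]
        height[new] = h
        newick[new] = (f"({newick[x]}:{h - height[x]:g},"
                       f"{newick[y]}:{h - height[y]:g})")
        active = [n for n in active if n not in (x, y)] + [new]
    return newick[active[0]] + ";"

tree = upgma_newick(NAMES, ROWS)
print(tree)
```

Under these assumed distances the branch lengths for B/F (0.5), A/D (4), G (6.25) and the internal branches 5.75 and 2.5 agree with the values legible on the tree slide.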

18 NJ
Neighbour Joining (NJ)
Neighbours = pair of nodes (sequences, OTUs) that are connected through a single internal node.
Example: [tree with leaves A, B, C and D] Nodes A and B are neighbours (connected by only one internal node), and nodes C and D are neighbours, whereas nodes A and C (for example) are not neighbours.
Saitou N, Nei M. (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 4:

19 NJ: Principle
How to find neighbours? How to construct the tree?
Principle:
Start with a "star" tree and a distance matrix.
Find the 2 nodes with the shortest distance (here: C+D).
Create an internal node (CD).
Compute the branch lengths (d_{C,CD}, d_{D,CD}, d_{A,CD}, ...).
Additive principle: d_{C,D} = d_{C,CD} + d_{D,CD}

20 NJ: Principle
Principle: repeat this process iteratively until the whole tree is obtained.

21 NJ: Principle
Principle: repeat this process iteratively until the whole tree is obtained.
         A         B         C         D         E
    A    -
    B    d_{A,B}   -
    C    d_{A,C}   d_{B,C}   -
    D    d_{A,D}   d_{B,D}   d_{C,D}   -
    E    d_{A,E}   d_{B,E}   d_{C,E}   d_{D,E}   -
The distance between two nodes in the tree = the distance given in the initial distance matrix.

22 NJ: Principle
Theory:
Saitou N, Nei M. (1987) The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 4:
Zvelebil & Baum (2008)
Terry Speed, lecture notes
The Saitou-Nei algorithm is a good approximation of the exact method and runs faster. It is illustrated with an example hereafter.

23 NJ: Example
[Distance matrix for nodes A, B, C, D and E]
The following example is from Prof. Tore Samuelsson (2012) Genomics and Bioinformatics - An Introduction to Programming Tools for Life Scientists (Chap. 9)

24 NJ: Example
We start by calculating the S_X value, defined as the sum of all the distances to node X:
S_X = \sum_{i=1}^{N} d_{X,i}
Here, we have:
S_A = d_{A,B} + d_{A,C} + d_{A,D} + d_{A,E} = … = 64
S_B = … = 60
S_C = … = 61
S_D = … = 73
S_E = … = 96
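
The row sums can be checked with a short sketch. The distance values below are an assumption: the original matrix did not survive in the text, so they were inferred from the intermediate results quoted in this example (the S values, the distances to the joined node and the final check). The names `PAIRS` and `dist` are hypothetical.

```python
# Pairwise distances (inferred from the quoted intermediate results; an assumption).
PAIRS = {("A", "B"): 11, ("A", "C"): 12, ("A", "D"): 17, ("A", "E"): 24,
         ("B", "C"): 9, ("B", "D"): 16, ("B", "E"): 24,
         ("C", "D"): 16, ("C", "E"): 24, ("D", "E"): 24}
nodes = "ABCDE"

def dist(x, y):
    """Symmetric lookup in the pairwise distance table."""
    return PAIRS[tuple(sorted((x, y)))]

# S_X = sum of the distances from X to every other node.
S = {x: sum(dist(x, y) for y in nodes if y != x) for x in nodes}
print(S)  # {'A': 64, 'B': 60, 'C': 61, 'D': 73, 'E': 96}
```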

25 NJ: Example
We then calculate a δ matrix where δ_{i,j} = d_{i,j} - (S_i + S_j) / (N-2)
Here, we have:
δ_{A,B} = d_{A,B} - (S_A + S_B) / (N-2) = 11 - (64 + 60) / 3 = -30.33
δ_{A,C} = 12 - (64 + 61) / 3 = -29.67
δ_{A,D} = 17 - (64 + 73) / 3 = -28.67

26 NJ: Example
[δ matrix for nodes A, B, C, D and E]
The numbers in this matrix reflect the relative total branch length of the trees in which nodes i and j have been joined as neighbours.

27 NJ: Example
[δ matrix for nodes A, B, C, D and E]
The numbers in this matrix reflect the relative total branch length of the trees in which nodes i and j have been joined as neighbours.
As we prefer the tree with the smallest total branch length, we identify the minimum value, which in this case is δ_{D,E} = 24 - (73 + 96) / 3 ≈ -32.3
Thus D and E are the first nodes to be joined, to form a new node DE.
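
The selection of the first pair can be reproduced with the sketch below, which builds the full δ matrix and picks its minimum. The distance values are again an assumption, inferred from the intermediate results quoted in this example rather than read from the original matrix; `PAIRS` and `dist` are hypothetical names.

```python
from itertools import combinations

# Pairwise distances (inferred from the quoted intermediate results; an assumption).
PAIRS = {("A", "B"): 11, ("A", "C"): 12, ("A", "D"): 17, ("A", "E"): 24,
         ("B", "C"): 9, ("B", "D"): 16, ("B", "E"): 24,
         ("C", "D"): 16, ("C", "E"): 24, ("D", "E"): 24}
nodes = "ABCDE"
n = len(nodes)

def dist(x, y):
    return PAIRS[tuple(sorted((x, y)))]

S = {x: sum(dist(x, y) for y in nodes if y != x) for x in nodes}

# delta_ij = d_ij - (S_i + S_j) / (N - 2)
delta = {(i, j): dist(i, j) - (S[i] + S[j]) / (n - 2)
         for i, j in combinations(nodes, 2)}

best = min(delta, key=delta.get)
print(best, round(delta[best], 2))  # ('D', 'E') -32.33
```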

28 NJ: Example
The distances d_{D,DE} and d_{E,DE} are calculated as
d_{D,DE} = (d_{D,E} + (S_D - S_E) / (N-2)) / 2 = (24 + (73 - 96) / 3) / 2 = 8.2
d_{E,DE} = d_{D,E} - d_{D,DE} = 15.8
These distances are used to build the tree:
[Tree figure: A, B, C and the new node DE, with branch lengths D 8.2 and E 15.8]
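
The two branch lengths follow directly from the formula above. The helper below is a small sketch (the name `nj_branch_lengths` is hypothetical) that uses only the numbers quoted on the slides: d_{D,E} = 24, S_D = 73, S_E = 96 and N = 5.

```python
def nj_branch_lengths(d_ij, S_i, S_j, n):
    """Branch lengths from nodes i and j to their new internal node."""
    l_i = (d_ij + (S_i - S_j) / (n - 2)) / 2
    l_j = d_ij - l_i      # the two branches add up to d_ij
    return l_i, l_j

d_D, d_E = nj_branch_lengths(24, 73, 96, 5)
print(round(d_D, 1), round(d_E, 1))  # 8.2 15.8
```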

29 NJ: Example
New distance matrix:
          A      B      C      DE
    A     -
    B     11     -
    C     12     9      -
    DE    8.5    8      8      -
The distances to the new node DE are calculated as
d_{A,DE} = (d_{A,D} + d_{E,A} - d_{D,E}) / 2 = (…) / 2 = 8.5
d_{B,DE} = (d_{B,D} + d_{E,B} - d_{D,E}) / 2 = (…) / 2 = 8
d_{C,DE} = (d_{C,D} + d_{E,C} - d_{D,E}) / 2 = (…) / 2 = 8
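
The update of the distance matrix can be sketched as follows. Only d_{A,D} = 17 and d_{D,E} = 24 are quoted elsewhere in the slides; the other inputs are assumed values, inferred so that the results match the 8.5, 8 and 8 given above.

```python
# Distances from A, B, C to the two joined nodes D and E (partly assumed, see above).
d = {"A": {"D": 17, "E": 24},
     "B": {"D": 16, "E": 24},
     "C": {"D": 16, "E": 24}}
d_DE = 24

# d_{X,DE} = (d_{X,D} + d_{X,E} - d_{D,E}) / 2
to_DE = {x: (row["D"] + row["E"] - d_DE) / 2 for x, row in d.items()}
print(to_DE)  # {'A': 8.5, 'B': 8.0, 'C': 8.0}
```

Unlike the UPGMA update, this is not a plain mean: the length of the joined pair's own path (d_{D,E}) is subtracted before halving.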

30 NJ: Example
[New δ matrix for nodes A, B, C and DE]
We repeat the operation. Note that here there are two minimum values. We have selected nodes B and C (to form node BC), but the same final tree is obtained if we choose A and DE.

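31 NJ: Example
The branch lengths are given by (with N = 4 and the sums recomputed on the reduced matrix, S_B = 11 + 9 + 8 = 28 and S_C = 12 + 9 + 8 = 29):
d_{B,BC} = (d_{B,C} + (S_B - S_C) / (N-2)) / 2 = (9 + (28 - 29) / 2) / 2 = 4.25
d_{C,BC} = d_{B,C} - d_{B,BC} = 9 - 4.25 = 4.75
and the tree becomes:
[Tree figure: A, the new node BC (branch B 4.25) and node DE (branch D 8.2)]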

32 NJ: Example
New distance matrix:
          A      BC     DE
    A     -
    BC    7      -
    DE    8.5    3.5    -
The distances to the new node BC are calculated as
d_{A,BC} = (d_{A,B} + d_{C,A} - d_{B,C}) / 2 = (11 + 12 - 9) / 2 = 7
d_{DE,BC} = (d_{B,DE} + d_{C,DE} - d_{B,C}) / 2 = (8 + 8 - 9) / 2 = 3.5

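33 NJ: Example
[New δ matrix for nodes A, BC and DE]
The branch lengths are given by:
d_{BC,ABC} = 1
d_{A,ABC} = 6
and the tree becomes:
[Tree figure: A (branch 6) and BC (branch 1) joined at the new node ABC, with DE still to be attached]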

34 NJ: Example
[New distance matrix for nodes ABC and DE]
[Final tree; legible branch lengths: 4.25 and 8.2]

35 NJ: Example
Check:
d_{C,D} (distance matrix) = 16
d_{C,D} (tree) = … (the sum of the branch lengths along the path from C to D in the final tree)
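
The whole example, including this final check, can be reproduced with a minimal neighbour-joining sketch. The distance values are an assumption, inferred from the intermediate results quoted on the slides; the function names are hypothetical. Ties are broken by first occurrence, so at the second step this sketch joins A with DE rather than B with C; as noted earlier in the example, both choices give the same final tree.

```python
from itertools import combinations

# Pairwise distances (inferred from the quoted intermediate results; an assumption).
PAIRS = {("A", "B"): 11, ("A", "C"): 12, ("A", "D"): 17, ("A", "E"): 24,
         ("B", "C"): 9, ("B", "D"): 16, ("B", "E"): 24,
         ("C", "D"): 16, ("C", "E"): 24, ("D", "E"): 24}

def neighbor_joining(pairs):
    """Return the unrooted NJ tree as a list of edges (node, node, length)."""
    d = {}
    for (a, b), v in pairs.items():
        d.setdefault(a, {})[b] = v
        d.setdefault(b, {})[a] = v
    edges = []
    while len(d) > 2:
        n = len(d)
        S = {x: sum(d[x].values()) for x in d}
        # Pair with the smallest delta_ij = d_ij - (S_i + S_j) / (N - 2).
        i, j = min(combinations(d, 2),
                   key=lambda p: d[p[0]][p[1]] - (S[p[0]] + S[p[1]]) / (n - 2))
        new = i + j
        l_i = (d[i][j] + (S[i] - S[j]) / (n - 2)) / 2
        edges += [(i, new, l_i), (j, new, d[i][j] - l_i)]
        # Distances from the new internal node to the remaining nodes.
        row = {k: (d[i][k] + d[j][k] - d[i][j]) / 2 for k in d if k not in (i, j)}
        for k in (i, j):
            del d[k]
        for k, v in row.items():
            del d[k][i], d[k][j]
            d[k][new] = v
        d[new] = row
    a, b = d
    edges.append((a, b, d[a][b]))  # last branch connecting the two remaining nodes
    return edges

def path_length(edges, start, end):
    """Sum of the branch lengths between two nodes of the tree."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    stack = [(start, None, 0.0)]
    while stack:
        node, parent, total = stack.pop()
        if node == end:
            return total
        stack.extend((nxt, node, total + w) for nxt, w in adj[node] if nxt != parent)

edges = neighbor_joining(PAIRS)
tree_CD = path_length(edges, "C", "D")
print(round(tree_CD, 2), "vs", PAIRS[("C", "D")])
```

Under these assumed distances the tree path from C to D is close to, but not exactly, the matrix value of 16, which is expected when the input distances are not perfectly additive.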

