Exercise set 1: Solutions

1. What are the main possible advantages of DNA-based computing? How about disadvantages?

Advantages:
- More parallel computation
- Able to store a tremendous amount of information in a very small space
- Various fields could be combined together to reach a desirable solution
- Allows for computing in wet environments
- Solution for nano-scale computing and engineering

Disadvantages:
- Takes much time to solve simple problems
- Errors are very difficult to handle
- Sometimes there may be an error in the pairing of the nucleotides present in the DNA strands
- Very few smart algorithmic solutions exist at the moment; most are based on exhaustive search (using the parallelism of biocomputation)

More details: http://www.bvicam.ac.in/new/nrsc%202007/pdf/paper/t_222_02_02_07.pdf

2. What is the DNA sequence corresponding to the Sanger plot below? The sequence is read from 5' to 3'.

[Figure: Sanger plot]

TGCACTTGAACGCATGCT

3. Given two sequences, which value is larger: their local similarity or their global similarity? Why? How does their semi-global similarity compare with the other two values?

Global similarity measures the similarity between the two given sequences over their full lengths. Local similarity is the maximum similarity between any factor (substring) of one sequence and any factor of the other. In semi-global similarity we seek a global alignment where we do not penalize gaps at one or the other end of the strings.

By definition, any global alignment is also a semi-global alignment, but there could be better semi-global alignments (each semi-global alignment can be seen as a global alignment between a prefix of one string and a suffix of the other string). Thus, the best semi-global score could be larger than the global score.

Also, any semi-global alignment is at the same time a local alignment, but there could be better local alignments (any local alignment could be seen as a semi-global alignment between a prefix of one string and a suffix of the other string; note that it can also be seen as a global alignment between two factors of the two strings). Thus, the local alignment score could be larger than the semi-global one.

The three scores are in general in the following relationship:

Global score <= semi-global score <= local score

4. Find all best global alignments between sequences AAAC and AGC, where the scoring scheme is +1 for match, -1 for mismatch and -2 for an alignment with a gap.

Two sequences are given:
s: AAAC
t: AGC

To find the best alignments between s and t, first create a score matrix D filled with the maximum alignment scores; the rightmost cell in the last row gives the best alignment score. The entries are calculated using the following formula, where g = -2 is the gap score:

$$D(i,j) = \max\{\, D(i,j-1) + g,\; D(i-1,j) + g,\; D(i-1,j-1) + f(s[i], t[j]) \,\}$$

Here f(s[i], t[j]) gives the match/mismatch score for characters s[i] and t[j]:

$$f(s[i], t[j]) = \begin{cases} 1 & \text{if } s[i] = t[j] \\ -1 & \text{otherwise} \end{cases}$$

D =            t
               A    G    C
          0    1    2    3
    0     0   -2   -4   -6
s A 1    -2    1
  A 2    -4
  A 3    -6
  C 4    -8

D(1,1) = max{ D(1,0) + (-2), D(0,1) + (-2), D(0,0) + f(s[1], t[1]) } = max{ -2 - 2, -2 - 2, 0 + 1 } = 1

(It is calculated from the cell D(0,0), so cell D(1,1) gets a diagonal arrow pointing to D(0,0).) Arrows in the table indicate from which cell the maximum score is calculated. We continue filling entries and tracing arrows.
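The fill step above, together with a traceback that follows every arrow from D(m,n) back to D(0,0), can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than part of the original solution: the function name `global_alignments` and its parameters are hypothetical, arrows are not stored but re-derived from the scores during the traceback, and gaps are written as `_`.

```python
def global_alignments(s, t, match=1, mismatch=-1, gap=-2):
    """Return (best score, score matrix D, all optimal global alignments)."""
    m, n = len(s), len(t)
    # D[i][j] = best score for aligning the prefixes s[:i] and t[:j]
    D = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        D[i][0] = i * gap          # first column: s[:i] against gaps only
    for j in range(1, n + 1):
        D[0][j] = j * gap          # first row: t[:j] against gaps only
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            f = match if s[i - 1] == t[j - 1] else mismatch
            D[i][j] = max(D[i][j - 1] + gap,       # left
                          D[i - 1][j] + gap,       # up
                          D[i - 1][j - 1] + f)     # diagonal

    def trace(i, j):
        """Yield every optimal alignment of s[:i] and t[:j] as a pair of rows."""
        if i == 0 and j == 0:
            yield "", ""
            return
        if i > 0 and j > 0:
            f = match if s[i - 1] == t[j - 1] else mismatch
            if D[i][j] == D[i - 1][j - 1] + f:     # diagonal arrow
                for a, b in trace(i - 1, j - 1):
                    yield a + s[i - 1], b + t[j - 1]
        if i > 0 and D[i][j] == D[i - 1][j] + gap:  # up arrow: gap in t
            for a, b in trace(i - 1, j):
                yield a + s[i - 1], b + "_"
        if j > 0 and D[i][j] == D[i][j - 1] + gap:  # left arrow: gap in s
            for a, b in trace(i, j - 1):
                yield a + "_", b + t[j - 1]

    return D[m][n], D, sorted(trace(m, n))


score, D, alignments = global_alignments("AAAC", "AGC")
print(score)
for top, bottom in alignments:
    print(top, bottom)
```

Calling the same function with `gap=-1` applies the scoring scheme of exercise 5. The recursive traceback is fine for strings of this size; for long sequences the number of optimal alignments can grow quickly, so an iterative version that returns only one alignment may be preferable.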
D =            t
               A    G    C
          0    1    2    3
    0     0   -2   -4   -6
s A 1    -2    1   -1   -3
  A 2    -4   -1    0   -2
  A 3    -6   -3   -2   -1
  C 4    -8   -5   -4   -1

The best alignment score is -1.

An optimal alignment can be found by walking along the traced arrows from D(m,n) back to D(0,0). There are 3 possible moves:
- Diag: the letters from the two sequences are aligned
- Left: a gap is introduced into the left sequence
- Up: a gap is introduced into the top sequence

The resulting alignments are as follows:

Alignment    Score                     Solution
AAAC
_AGC         (-2)+1+(-1)+1 = -1        Best alignment

AAAC
AG_C         1+(-1)+(-2)+1 = -1        Best alignment

AAAC
A_GC         1+(-2)+(-1)+1 = -1        Best alignment

The alignments listed above are the best alignments.

5. Find all best global alignments between sequences ATAG and TTCG, where the scoring scheme is +1 for match, -1 for mismatch and -1 for an alignment with a gap.
D =            t
               T    T    C    G
          0    1    2    3    4
    0     0   -1   -2   -3   -4
s A 1    -1   -1   -2   -3   -4
  T 2    -2    0    0   -1   -2
  A 3    -3   -1   -1   -1   -2
  G 4    -4   -2   -2   -2    0

Start tracing a path from D(4,4) to D(0,0).

Alignment    Score
ATAG
TTCG         (-1)+1+(-1)+1 = 0

A_TAG
_TTCG        (-1)+(-1)+1+(-1)+1 = -1

The traceback from D(4,4) follows diagonal arrows only, so ATAG / TTCG is the unique best alignment, with score 0. The gapped alignment scores -1 and is therefore not optimal.

6. Find all best local alignments between sequences ATACTGGG and TGACTGAG, using the same scoring scheme as in exercise 4 (+1 for match, -1 for mismatch and -2 for an alignment with a gap).

Two sequences are given:
s: ATACTGGG
t: TGACTGAG

Method: Smith-Waterman.

Entries in the table are calculated using the following formula, where g = -2 is the gap score and f is the match/mismatch score:

$$L(i,j) = \max\{\, 0,\; L(i-1,j-1) + f(s[i], t[j]),\; L(i-1,j) + g,\; L(i,j-1) + g \,\}$$

L =            t
               T   G   A   C   T   G   A   G
           0   1   2   3   4   5   6   7   8
    0      0   0   0   0   0   0   0   0   0
s A 1      0   0   0   1   0   0   0   1   0
  T 2      0   1   0   0   0   1   0   0   0
  A 3      0   0   0   1   0   0   0   1   0
  C 4      0   0   0   0   2   0   0   0   0
  T 5      0   1   0   0   0   3   1   0   0
  G 6      0   0   2   0   0   1   4   2   1
  G 7      0   0   1   1   0   0   2   3   3
  G 8      0   0   1   0   0   0   1   1   4

From the above matrix we find the highest score, 4, which occurs at L(6,6) and L(8,8), and trace the path back from each of these cells until we come to a cell with score zero. This cell is not included in the alignment.

Alignment    Score
ACTGGG
ACTGAG       1+1+1+1+(-1)+1 = 4

ACTG
ACTG         1+1+1+1 = 4

The above alignments are the best local alignments.

7. What scoring scheme should you use to determine the longest common substring and the longest common subsequence for two given strings, using the algorithm for best global alignment?

For the longest common substring and the longest common subsequence we do not want to penalize gaps at the beginning and at the end. To exclude mismatches or gaps, we penalize them drastically, say by scoring them with -N, where N is larger than the length of either string. For example, N = 1 + (length(s) + length(t)) (or any other number larger than both m and n).

The scoring scheme for the longest common substring of two given strings s and t, where the length of s is m and the length of t is n, is as follows. We reward matches with 1 and exclude mismatches and gaps by scoring them with -N.

L(0,j) = L(i,0) = 0, where i = 0,1,...,m and j = 0,1,...,n.

$$L(i,j) = \begin{cases} L(i-1,j-1) + 1 & \text{if } s[i] = t[j] \\ -N & \text{otherwise} \end{cases}$$

The scoring scheme for the longest common subsequence would be as follows. We reward matches with 1 and exclude mismatches by scoring them with -N. For the longest common subsequence we allow the insertion of gaps.
L(0,j) = L(i,0) = 0, where i = 0,1,...,m and j = 0,1,...,n.

$$L(i,j) = \max\{\, L(i,j-1),\; L(i-1,j),\; L(i-1,j-1) + f(s[i], t[j]) \,\}$$

where

$$f(s[i], t[j]) = \begin{cases} 1 & \text{if } s[i] = t[j] \\ -N & \text{otherwise} \end{cases}$$

- We fill the scoring matrix according to the above scoring scheme, together with tracing arrows. Each arrow points to the cell from which the current cell value comes.
- Start with the lower-right corner cell and trace back until the upper-left corner is reached.
- Extract from this sequence of numbers the longest non-increasing sequence. This sequence, without its last number, will indicate the longest common substring/subsequence.
- Another way in which this matrix can be used is as follows. Following the idea from the local alignment algorithm: start from the highest value in the matrix and go as far as possible along a path whose values form a non-increasing sequence.

8. Apply the scoring scheme you indicated in exercise 7 to find all longest common substrings for strings ATACTGGG and TGACTGGT.

Assume we score the insertion of a gap and a mismatch with -9 (any penalty whose magnitude is larger than the length of either string is ok). The longest common substring is ACTGG. The highlighted path is the diagonal with values -8, -7, -6, -5, -4, which is non-increasing when traced back from its lower-right end.

ATACTGGG
TGACTGGT

             A    T    A    C    T    G    G    G
        0    1    2    3    4    5    6    7    8
    0   0    0    0    0    0    0    0    0    0
  T 1   0   -9    1   -9   -9    1   -9   -9   -9
  G 2   0   -9   -9   -9   -9   -9    2   -8   -8
  A 3   0    1   -9   -8   -9   -9   -9   -9   -9
  C 4   0   -9   -9   -9   -7   -9   -9   -9   -9
  T 5   0   -9   -8   -9   -9   -6   -9   -9   -9
  G 6   0   -9   -9   -9   -9   -9   -5   -8   -8
  G 7   0   -9   -9   -9   -9   -9   -8   -4   -7
  T 8   0   -9   -8   -9   -9   -8   -9   -9   -9
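The matrix above can be reproduced with a short Python sketch of the longest common substring scheme from exercise 7. It is an illustration under stated assumptions, not the original solution's procedure: the function name `longest_common_substrings` and the `penalty` parameter are hypothetical, and instead of tracing arrows the code reads the length of each diagonal run of matches directly from the cell value (a run that starts at the border has a positive value equal to its length; a run that starts after a mismatch has value -N plus its length). This decoding assumes N is larger than the length of either string.

```python
def longest_common_substrings(s, t, penalty=None):
    """Fill the exercise-7 substring matrix; return (matrix, longest common substrings)."""
    m, n = len(s), len(t)
    # N must exceed the length of either string so that run lengths decode uniquely.
    N = penalty if penalty is not None else 1 + m + n
    L = [[0] * (n + 1) for _ in range(m + 1)]   # L(0,j) = L(i,0) = 0
    best, found = 0, set()
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                L[i][j] = L[i - 1][j - 1] + 1   # extend the diagonal run
            else:
                L[i][j] = -N                    # mismatch: drastic penalty
            v = L[i][j]
            # Length of the run of matches ending at (i, j):
            # positive values count from the border, others from a -N cell.
            run = v if v > 0 else v + N
            if run > best:
                best, found = run, {s[i - run:i]}
            elif run == best and run > 0:
                found.add(s[i - run:i])
    return L, found


# Rows follow TGACTGGT and columns follow ATACTGGG, as in the table above.
L, substrings = longest_common_substrings("TGACTGGT", "ATACTGGG", penalty=9)
print(substrings)
print(L[7][7])
```

With `penalty=9` the cell values match the table, and the set returned contains the single substring ACTGG. Leaving `penalty` unset uses N = 1 + m + n, the example value suggested in exercise 7, which changes the negative entries but not the result.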