Patterns, Profiles, and
|
|
- Bernard McDowell
- 5 years ago
- Views:
Transcription
1 Patterns, Profiles, and Mltiple l Alignments
2 Otline Profiles, Position Specific Scoring Matrices Profile Hidden Markov Models Alignment of Profiles Mltiple Alignment Algorithms Problem definition Can we se Dynamic Programming to solve MSA? Progressive Alignment ClstalW Scoring Mltiple Alignments Entropy Sm of Pairs SP Score Seqence Pattern Discovery
3 Mltiple Seqences Up to this point we have considered pairwise relationships between seqences Protein families contain mltiple l seqences E.g. Globins Conserved, important regions A common task is to find ot whether a new seqence can be inclded in a protein family or not. We can then assign a fnction and predict 3D strctre
4 Profile Representation of Mltiple Seqences - A G G C T A T C A C C T G T A G C T A C C A G C A G C T A C C A G C A G C T A T C A C G G C A G C T A T C G C G G A C G T
5 Profile Representation of Mltiple Seqences - A G G C T A T C A C C T G T A G C T A C C A G C A G C T A C C A G C A G C T A T C A C G G C A G C T A T C G C G G A C G T Earlier, we were aligning a seqence against a seqence Can we align a seqence against a profile in order to find ot whether this seqence belongs to that protein family?
6 PSSM Position Specific Scoring Matrices The profile representation of a protein family can be a considered as the coefficients of a scoring matrix which give amino acid preferences for each alignment position separately How to obtain PSSMs? From a given mltiple alignment of seqences Database searches where a set of database proteins are aligned to the qery i.e. reference seqence
7 How to obtain PSSMs? Sppose we have a mltiple alignment of N seqences. To obtain the vales of the PSSM matrix m,b, we can se the techniqe we have sed to align a seqence to a profile and se the freqencies of amino acids at each colmn as coefficients: n, b = m,, a N f, b = f s, b a, b reside types b
8 A better way to weigh reside preferences It is better to give preferred resides extra weight, becase reside types rarely fond are probably highly disfavored at that colmn Use logarithmic weighting: m, a ln1 f n, b s f =, b a, ln1/ N + 1, b reside N + 1 = b types b
9 Using log-odds odds ratios We can also se a techniqe similar to the constrction of the common scoring matrices m, a = log q, a p a If sfficient data is available: q = f, a, a
10 If sfficient data is not available q,a will case problems Amino acids that do not occr in a colmn will case a score of - A simple soltion: assme at least one occrrence of each reside type q, a = N n, a + 1 N + 20 These additional conts are called psedoconts.
11 A better formla Incorporate backgrond amino acid composition q, a n + β p =, a N + β a β is the total nmber of psedoconts in a colmn. Can be adjsted based on the amont of data available.
12 Viewing Profiles/PSSMs as logos I Reside contribtion at each position: = log 2 20 H f, a I H, a = f, a log 2 f a ed
13 Profile Hidden Markov Models Using HMMs to represent a profile and to align a seqence to a profile
14 Hidden Markov Model Hidden Markov models are probabilistic finite state machines. A hidden Markov model λ is determined by the following parameters: A set of finite nmber of states: S i, 1 i N The probability of starting in state S i, π i The transition probability from state S i to S j, a ij The emission probability density of a symbol ω in state S i
15 What is hidden? With a hidden Markov model, we sally model a temporal process whose otpt we can observe bt do not know the actal nderlying mathematical or physical model. We try to model this process statistically. Here the states of the hidden Markov model are hidden to s. We assme that there is some nderlying model or some logic that prodces a set of otpt signals.
16 Problems associated with HMMs There are three typical qestions one can ask regarding HMMs: Given the parameters of the model, compte the probability of a particlar otpt seqence. This problem is solved by the forward-backward procedre. Given the parameters of the model, find the most likely seqence of hidden states that cold have generated a given otpt seqence. This problem is solved by the Viterbi algorithm. Given an otpt seqence or a set of sch seqences, find the most likely set of state transition and otpt probabilities. In other words, train the parameters of the HMM given a dataset of otpt seqences. This problem is solved by the Bam-Welch algorithm.
17 Example from Wikipedia Yo have a fi friend to whom yo talk lkdaily on the phone. Yor fi friend is only interested in three activities: walking in the park, shopping, and cleaning his apartment. The choice of what to do is determined exclsively by the weather on a given day. Yo have no definite information abot the weather where yor friend lives, bt yo know general trends. Based on what he tells yo he did each day, yo try to gess what the weather mst have been like. Th t t t "R i " d "S " b t t b There are two states, "Rainy" and "Snny", bt yo cannot observe them directly, that is, they are hidden from yo. On each day, there is a certain chance that yor friend will perform one of the following activities, depending on the weather: "walk", "shop", or "clean". Since yor friend tells yo abot his activities, those are the observations.
18 Example from Wikipedia Yo know the general weather trends in the area, and what yor friend likes to do on the average. In other words, the parameters of the HMM are known. walk: 0.1 shop: 0.4 clean: walk: 0.6 shop: 0.3 clean: Rainy Snny start
19 Example from Wikipedia Now, yo talked to yor friend for three days, and he told yo that on the first day walked, the next day he shopped and the last day he cleaned. What is the most likely seqence of days that cold have prodced this otcome? Solved by the Viterbi algorithm What is the overall probability bilit of the observations? Solved by forward-backward algorithm walk: walk: 0.6 shop: 0.4 shop: 0.3 clean: 0.5 clean: start t Rainy Snny
20 HMMs to represent a family of seqences Given a mltiple alignment of seqences, we can se an HMM to model the seqences. Each colmn of the alignment may be represented by a hidden state that prodced that colmn. Insertions and deletions can be represented by other states.
21 Profile HMMs from alignments A given mltiple seqence alignment may be sed to get the following HMM.
22 Profile HMMs The strctre of the profile HMM is given by: From: simple.png There are match, insert, and delete states. Given a mltiple seqence alignment we can easily determine the HMM parameters no Bam-Welch needed in this case
23 Profile HMMS The strctre is of profile HMMs is sally fixed following some biological observations:
24 Variants for non-global alignments Local alignments flanking model Emission prob. in flanking states se backgrond vales p a. a Looping prob. close to 1, e.g. 1- η for some small η. D j I j M j Begin End Q Q
25 Variants for non-global alignments Overlap alignments Only transitions to the first model state are allowed. When expecting to find either present as a whole or absent Transition to first delete state allows missing first reside D j Q I j Q Begin M j End
26 Variants for non-global alignments Repeat alignments Transition from right flanking state back to random model Can find mltiple matching segments in qery string D j I j M j Begin Q End
27 Determining the states of the HMM The strctre is sally fixed and only the nmber of match states is to be determined An alignment colmn with no gaps can be considered as a match state. An alignment colmn with a majority of gaps can be considered an insert state. Usally a threshold on the gap proportion is determined to find the match states
28 Determining the transition probabilities The transition probabilities from a state always add p to 1 except the end state which do not have any transitions from it. t, v = w 1
29 Determining the transition probabilities Cont all the transitions on the given mltiple alignment from an alignment position or state and se them in the following eqation to find transition probabilities from a state t, v = m w, v m,w,
30 Determining the emission probabilities Determining the emission probabilities Emission probabilities in a match or insert state also adds p to 1. = a M M n a e, + = a n M a e 1, b b M M n a e, + = b b M M n a e 20, Using psedoconts a I p a e =
31 Psedoconts in HMMs If we do not observe a certain amino acid at a colmn of an MSA the emission probability is 0 for that amino acid. This strategy will give 0 probability to sch seqences. However, we do not want to do that, i.e., we do not want to miss seqences that may be biologically important becase of only one mismatch Soltion: Add hypothetical occrrences of those amino acids into the colmns. E.g, assme a backgrond distribtion for each colmn. The actal appearances of amino acids at a colmn pdate the backgrond distribtion.
32 Example withot psedoconts
33 Example with psedoconts
34 Scoring a seqence against a profile HMM Given a profile HMM, any given path throgh the model will emit a seqence with an associated probability The path probability is the prodct of all transition and emission probabilities along the path.
35 Scoring a seqence against a profile HMM Given a qery seqence sing algorithms similar to DP for seqence alignment we can compte the most probable path that will emit that qery seqence Viterbi algorithm The probability that the given qery seqence is emitted by that HMM is given by the sm of all probabilities over all possible paths forward/backward algorithms.
36 Viterbi algorithm It is a dynamic programming algorithm. It is based on the following assmptions: At any time the process we are modeling is in some state We have finite nmber of states Each state prodces a single otpt Compting the most likely hidden seqence p to a certain point t mst depend only on the observed event at point t, and the most likely seqence at point t 1. These assmptions are satisfied by a first order HMM.
37 Viterbi DP recrrence relations Viterbi DP recrrence relations M M t x v =,, max i I i M i M i M M I t x v M M t x v x e x v, i D M D t x v =,, max 1 1 i I i M x i I I I t x v I M t x v p x v i i 1 i M D M t x v =,, max i D i M i D D D t x v D M t x v x v
38 Viterbi algorithm Using these eqations we can fill in a partial scores table v of size Initialization: length qery x nmber of states in HMM set v Start = 1, v = 0 for all states H lti l i b biliti ill However, mltiplying many probabilities will lead to very small nmbers for long seqences. Therefore se log of probabilities instead and convert prodcts to smmation
39 Viterbi DP recrrence relations Viterbi DP recrrence relations + log log M M t x v =, log log, log log max log log i I i M i M i M M I t x v M M t x v x e x v +, log log i D M D t x v =, log log, log log max log log 1 1 i I i M x i I I I t x v I M t x v p x v i i + log log 1 i M D M t x v + + =, log log, log log max log i D i M i D D D t x v D M t x v x v
40 Forward algorithm Forward and backward algorithms are sed to compte the probability of the seqence being emitted from an HMM by smming p the probabilities over all possible paths. Modification on Viterbi DP eqations: Instead of sing max, sm all options
41 Forward algorithm Forward algorithm, [ 1 1 i M i M i M M M t x f x e x f + =,, [ i I i M i M i M M I t x f f f + + ], i D M D t x f + ],, [ 1 1 i I i M x i I I I t x f I M t x f p x f i + = ] [ D D t f D M t f f + ],, [ i D i M i D D D t x f D M t x f x f + =
42 Searching a database Given the hidden Markov model for a protein family. We can evalate all the seqences in a database in terms of how likely they cold have been prodced by this HMM model. This likelihood score can be sed to find new protein seqences as candidate members for that protein family. We can se the Viterbi algorithm or the forward/backward algorithms to compte the probability. bilit
43 Finding the HMM for a protein family Given a set of seqences, can we find the parameters of an HMM withot performing a mltiple seqence alignment first? Yes. We can se the Bam-Welch algorithm However, we have to decide first How many match states are there?
44 Bam-Welch Algorithm The Bam-Welch algorithm is an expectationmaximization EM algorithm. It can compte maximm likelihood estimates and posterior mode estimates for the parameters transition and emission probabilities of an HMM, when given only emissions as training data.
45 Available profile HMM tools SAM HMMER2 and many others
46 Mltiple alignment algorithms One of the most essential tools in moleclar biology Finding highly conserved sbregions or embedded patterns of a set of biological seqences Conserved regions sally are key fnctional regions, prime targets for drg developments Estimation of evoltionary distance between seqences Prediction of protein secondary/tertiary ti strctre t Practically sefl methods only since 1987 D. Sankoff Before 1987 they were constrcted by hand Dynamic programming is expensive
47 Mltiple Seqence Alignment MSA What is mltiple seqence alignment? Given k seqences: VTISCTGSSSNIGAGNHVKWYQQLPG VTISCTGTSSNIGSITVNWYQQLPG LRLSCSSSGFIFSSYAMYWVRQAPG LSLTCTVSGTSFDDYYSTWVRQPPG PEVTCVVVDVSHEDPQVKFNWYVDG ATLVCLISDFYPGAVTVAWKADS AALGCLVKDYFPEPVTVSWNSG VSLTCLVKGFYPSDIAVEWESNG
48 Mltiple Seqence Alignment MSA An MSA of these seqences: VTISCTGSSSNIGAG-NHVKWYQQLPG VTISCTGTSSNIGS--ITVNWYQQLPG LRLSCSSSGFIFSS--YAMYWVRQAPG LSLTCTVSGTSFDD--YYSTWVRQPPG PEVTCVVVDVSHEDPQVKFNWYVDG-- ATLVCLISDFYPGA--VTVAWKADS-- AALGCLVKDYFPEP--VTVSWNSG--- VSLTCLVKGFYPSD--IAVEWESNG--
49 Mltiple Seqence Alignment MSA An MSA of these seqences: VTISCTGSSSNIGAG-NHVKWYQQLPG VTISCTGTSSNIGS--ITVNWYQQLPG LRLSCSSSGFIFSS--YAMYWVRQAPG LSLTCTVSGTSFDD--YYSTWVRQPPG PEVTCVVVDVSHEDPQVKFNWYVDG-- ATLVCLISDFYPGA--VTVAWKADS-- AALGCLVKDYFPEP--VTVSWNSG--- VSLTCLVKGFYPSD--IAVEWESNG-- Conserved resides
50 Mltiple Seqence Alignment MSA An MSA of these seqences: VTISCTGSSSNIGAG-NHVKWYQQLPG VTISCTGTSSNIGS--ITVNWYQQLPG LRLSCSSSGFIFSS--YAMYWVRQAPG LSLTCTVSGTSFDD--YYSTWVRQPPG PEVTCVVVDVSHEDPQVKFNWYVDG-- ATLVCLISDFYPGA--VTVAWKADS-- AALGCLVKDYFPEP--VTVSWNSG--- VSLTCLVKGFYPSD--IAVEWESNG-- Conserved regions
51 Mltiple Seqence Alignment MSA An MSA of these seqences: VTISCTGSSSNIGAG-NHVKWYQQLPG VTISCTGTSSNIGS--ITVNWYQQLPG LRLSCSSSGFIFSS--YAMYWVRQAPG LSLTCTVSGTSFDD--YYSTWVRQPPG PEVTCVVVDVSHEDPQVKFNWYVDG-- ATLVCLISDFYPGA--VTVAWKADS-- AALGCLVKDYFPEP--VTVSWNSG--- VSLTCLVKGFYPSD--IAVEWESNG-- Patterns? Positions 1 and 3 are hydrophobic resides
52 Mltiple Seqence Alignment MSA An MSA of these seqences: VTISCTGSSSNIGAG-NHVKWYQQLPG VTISCTGTSSNIGS--ITVNWYQQLPG LRLSCSSSGFIFSS--YAMYWVRQAPG LSLTCTVSGTSFDD--YYSTWVRQPPG PEVTCVVVDVSHEDPQVKFNWYVDG-- ATLVCLISDFYPGA--VTVAWKADS-- AALGCLVKDYFPEP--VTVSWNSG--- VSLTCLVKGFYPSD--IAVEWESNG-- Conserved resides, regions, patterns
53 MSA Warnings MSA algorithms work nder the assmption that they are aligning related seqences They will align ANYTHING they are given, even if nrelated If it jst looks wrong it probably is
54 Generalizing the Notion of Pairwise Alignment Alignment of 2 seqences is represented as a 2-row matrix In a similar way, we represent alignment of 3 seqences as a 3-row matrix A T _ G C G _ A _ C G T _ A A T C A C _ A Score: more conserved colmns, better alignment
55 Alignments = Paths in k dimensional grids Align 3 seqences: ATGC, AATC,ATGC A -- T G C A A T -- C -- A T G C
56 Alignment Paths A -- T G C x coordinate A A T -- C -- A T G C
57 Alignment Paths A -- T G C A A T -- C x coordinate y coordinate -- A T G C
58 Alignment Paths A -- T G C A A T -- C A T G C x coordinate y coordinate z coordinate Reslting path in x,y,z space: 0,0,0 1,1,0 1,2,1,,,,, 2,3,2,, 3,3,3,, 4,4,4,,
59 Aligning Three Seqences Same strategy as aligning two seqences Use a 3-D Dmatrix, with each axis representing a seqence to align For global alignments, go from sorce to sink sorce sink
60 2-D Dvs3D 3-D Alignment Grid V W 2-D alignment matrix 3-D alignment matrix
61 2-D cell verss 3-D Alignment Cell In 2-D, 3 edges in each nit sqare In 3-D, 7 edges in each nit cbe
62 Architectre of 3-D Alignment Cell i-1,j-1,k-1 i-1,j,k-1 i-1,j-1,k i-1,j,k i,j-1,k-1 i,j,k-1 i,j-1,k,j, i,j,k
63 Mltiple Alignment: Dynamic Programming s i,j,k = max s i-1,j-1,k-1 + δv i, w j, k s i-1,j-1,k + δ v i, w j, _ s i-1,j,k-1 1jk1 + δ v i, _, k s i,j-1,k-1 + δ _, w j, k s i-1,j,k 1jk + δ v i, _, _ s i,j-1,k + δ _, w j, _ s i,j,k-1 + δ _, _, k cbe diagonal: no indels face diagonal: one indel edge diagonal: two indels δx, y, z isanentryinthe3-d in the scoring matrix
64 Mltiple Alignment: Rnning Time For 3 seqences of length n, the rn time is 7n 3 ; On 3 For k seqences, bild a k-dimensional matrix, ti with rn time 2 k -1n k ; O2 k n k Conclsion: dynamic programming approach for alignment between two seqences is easily extended d to k seqences bt it is impractical de to exponential rnning time
65 Mltiple Alignment Indces Pairwise Alignments Every mltiple alignment indces pairwise alignments Indces: x: AC-GCGG-C y: AC-GC-GAG z: GCCGC-GAG x: ACGCGG-C; x: AC-GCGG-C; y: AC-GCGAG y: ACGC-GAC; z: GCCGC-GAG; z: GCCGCGAG
66 Reverse Problem: Constrcting Mltiple Alignment from Pairwise Alignments Given 3 arbitrary pairwise alignments: x: ACGCTGG-C; x: AC-GCTGG-C; y: AC-GC-GAG y: ACGC--GAC; z: GCCGCA-GAG; z: GCCGCAGAG can we constrct a mltiple alignment that indces can we constrct a mltiple alignment that indces them?
67 Reverse Problem: Constrcting Mltiple Alignment from Pairwise Alignments Given 3 arbitrary pairwise alignments: x: ACGCTGG-C; x: AC-GCTGG-C; y: AC-GC-GAG y: ACGC--GAC; z: GCCGCA-GAG; z: GCCGCAGAG can we constrct a mltiple alignment that indces them? NOT ALWAYS Pairwise alignments may be inconsistent
68 Inferring Mltiple Alignment from Pairwise Alignments From an optimal mltiple alignment, we can infer pairwise alignments between all pairs of seqences, bt they are not necessarily optimal It is difficlt to infer a good mltiple alignment from optimal pairwise alignments between all seqences
69 Combining Optimal Pairwise Alignments into Mltiple Alignment Can combine pairwise alignments into mltiple alignment Can not combine pairwise alignments into mltiple alignment
70 Consenss String of a Mltiple Alignment - A G G C T A T C A C C T G T A G C T A C C A G C A G C T A C C A G C A G C T A T C A C G G C A G C T A T C G C G G C A G - C T A T C A C - G G Consenss String: C A G C T A T C A C G G The consenss string S M derived from mltiple alignment M is the concatenation of the consenss characters for each colmn of M. The consenss character for colmn i is the character that minimizes the smmed distance to it from all the characters in colmn i. i.e., if match and mismatch scores are eqal for all symbols, the majority symbol is the consenss character
71 Profile Representation of Mltiple Alignment - A G G C T A T C A C C T G T A G C T A C C A G C A G C T A C C A G C A G C T A T C A C G G C A G C T A T C G C G G A C G T
72 Profile Representation of Mltiple Alignment - A G G C T A T C A C C T G T A G C T A C C A G C A G C T A C C A G C A G C T A T C A C G G C A G C T A T C G C G G A C G T Earlier, we were aligning a seqence against a seqence Can we align a seqence against a profile? Can we align a profile against a profile?
73 Aligning alignments Given two alignments, can we align them? x GGGCACTGCAT y GGTTACGTC-- Alignment 1 z GGGAACTGCAG w GGACGTACC-- Alignment 2 v GGACCT-----
74 Aligning alignments Given two alignments, can we align them? Hint: se alignment of corresponding profiles x GGGCACTGCAT y GGTTACGTC-- z GGGAACTGCAG w GGACGTACC-- v GGACCT----- Combined Alignment
75 Mltiple Alignment: Greedy Approach Choose most similar pair of strings and combine into a profile, thereby redcing alignment of k seqences to an alignment of of k-1 seqences/profiles. Repeat This is a heristic greedy method 1 = ACGTACGTACGT 1 = ACg/tTACg/tTACg/cT k 2 = TTAATTAATTAA 3 = ACTACTACTACT 2 = TTAATTAATTAA k = CCGGCCGGCCGG k-1 k = CCGGCCGGCCGG
76 Greedy Approach: Example Consider these 4 seqences s1 s2 s3 s4 GATTCA GTCTGA GATATT GTCAGC
77 Greedy Approach: Example cont d There are = 6 possible alignments s2 GTCTGA s4 GTCAGC score = 2 s1 GAT-TCA s2 G-TCTGA score = 1 s1 GAT-TCA TCA s3 GATAT-T score = 1 s1 GATTCA-- s4 G T-CAGCscore = 0 s2 G-TCTGA s3 GATAT-T score = -1 s3 GAT-ATT ATT s4 G-TCAGC score = -1
78 Greedy Approach: Example cont d s2 s4 s 2 and s 4 are closest; combine: GTCTGA GTCAGC s 2,4 profile GTCt/aGa/cA / new set of 3 seqences: s 1 s 3 s 2,4 GATTCA GATATT GTCt/aGa/c
79 Progressive Alignment Progressive alignment is a variation of greedy algorithm with a somewhat more intelligent strategy for choosing the order of alignments. Progressive alignment works well for close seqences, bt deteriorates for distant seqences Gaps in consenss string are permanent Use profiles to compare seqences
80 Star alignment Heristic method for mltiple l seqence alignments Select a seqence c as the center of the star For each seqence x 1,, x k sch that index i c, perform a Needleman-Wnsch global alignment Aggregate alignments with the principle once a gap, always a gap.
81 Choosing a center Try them all and pick the one which h is most similar il to all of the seqences Let Sx i,xx j be the optimal score between seqences x i and x j. Calclate all Ok 2 alignments, and choose as x c the seqence x i that maximizes the following Σ Sx i,x j j i
82 Star alignment example MPE MSKE S 1 : MPE s 1 s 3 S 2 :MKE MKE M-KE S 3 : MSKE S s 4 : SKE SKE s 2 M-PE M-PE MPE M-KE M-KE s 4 MKE MSKE MSKE S-KE MKE
83 Analysis Assming all seqences have length n Ok 2 n 2 to calclate center Step i of iterative pairwise alignment takes Oi n n time two strings of length n and i n Ok 2 n 2 overall cost
84 ClstalW Most poplar mltiple alignment tool today W stands for weighted different parts of alignment are weighted differently. Three-step process 1. Constrct pairwise alignments 2. Bild Gide Tree by Neighbor Joining method 3. Progressive Alignment gided by the tree - The seqences are aligned progressively according to the branching order in the gide tree
85 Step 1: Pairwise Alignment Aligns each seqence again each other giving a similarity matrix Similarity = exact matches / seqence length percent identity v 1 v 2 v 3 v 4 v 1 - v v v means 17 % identical ca
86 Step 2: Gide Tree Create Gide Tree sing the similarity matrix ClstalW ses the neighbor-joining method Gide tree roghly reflects evoltionary g y y relations
87 Step 2: Gide Tree cont d v 1 v 2 v 3 v 4 v 1 v 3 v 1 - v 4 v v 2 v v Calclate: v 1,3 = alignment v 1, v 3 v 1,3,4 = alignmentv 1,3,v 4 v 1,2,3,4 = alignmentv 1,3,4,v 2
88 Step 3: Progressive Alignment Start by aligning the two most similar seqences Following the gide tree, add in the next seqences, aligning to the existing alignment Insert gaps as necessary FOS_RAT PEEMSVTS-LDLTGGLPEATTPESEEAFTLPLLNDPEPK-PSLEPVKNISNMELKAEPFD PSLEPVKNISNMELKAEPFD FOS_MOUSE PEEMSVAS-LDLTGGLPEASTPESEEAFTLPLLNDPEPK-PSLEPVKSISNVELKAEPFD FOS_CHICK SEELAAATALDLG----APSPAAAEEAFALPLMTEAPPAVPPKEPSG--SGLELKAEPFD FOSB_MOUSE PGPGPLAEVRDLPG-----STSAKEDGFGWLLPPPPPPP LPFQ FOSB_HUMAN PGPGPLAEVRDLPG-----SAPAKEDGFSWLLPPPPPPP LPFQ LPFQ.. : **. :.. *:.* *. * **: Dots and stars show how well-conserved a colmn is.
89 ClstalW: another example S 1 S 2 S 3 S 4 ALSK TNSD NASK NTSD
90 ClstalW example S 1 S 2 S 3 S 4 ALSK TNSD NASK NTSD All pairwise alignments S 1 S 2 S 3 S 4 S S S S 4 0 Distance Matrix
91 ClstalW example S 1 S 2 S 3 S 4 ALSK TNSD NASK NTSD All pairwise alignments S 1 S 2 S 3 S 4 S3 S Neighbor 3 S Joining S S 2 S S S 4 0 Rooted Tree Distance Matrix S 1
92 ClstalW example S 1 ALSK Mltiple Alignment Steps S 2 TNSD S 3 NASK 1. Align S 1 with S 3 2. Align S NTSD 2 with S 4 S 4 All pairwise alignments 3. Align S 1, S 3 with S 2, S 4 S 1 S 2 S 3 S 4 S3 S Neighbor 3 S Joining S S 2 S S S 4 0 Rooted Tree Distance Matrix S 1
93 ClstalW example -ALSK Mltiple Alignment Steps S ALSK 1. Align S 1 with S 1 -ALSK NA-SK 3 S 2 TNSD -TNSD S 3 NASK NA-SK -TNSD 2. Align S 2 with S NT-SD 4 NTSD NT-SD S 4 All pairwise alignments Mltiple Alignment 3. Align S 1, S 3 with S 2, S 4 S 1 S 1 S 2 S 3 S 4 S S S S 4 0 Neighbor Joining Rooted Tree S 3 S 2 S 4 Distance Matrix
94 Other progressive approaches PILEUP Similar to CLUSTALW Uses UPGMA to prodce tree
95 Problems with progressive alignments Depend on pairwise alignments If seqences are very distantly related, mch higher likelihood of errors Care mst be made in choosing scoring matrices and penalties
96
97 Iterative refinement in progressive alignment Another problem of progressive alignment: Initial alignments are frozen even when new evidence comes Example: x: GAAGTT y: GAC-TT Frozen! z: GAACTG GTACTG w: GTACTG Now clear that correct y = GA-CTT
98 Evalating mltiple alignments Balibase benchmark Thompson, 1999 De-facto standard for assessing the qality of a mltiple alignment tool Manally refined mltiple seqence alignments Qality measred by how good it matches the core blocks Another benchmark: SABmark benchmark Based on protein strctral families
99 Scoring mltiple alignments Ideally, a scoring scheme shold Penalize variations in conserved positions higher Relate seqences by a phylogenetic tree Tree alignment Usally assme Independence of colmns Qality comptation Entropy-based scoring Compte the Shannon entropy of each colmn Sm-of-pairs SP score
100 Mltiple Alignments: Scoring Nmber of matches mltiple longest common sbseqence score Entropy score Sm of pairs SP-Score
101 Mltiple LCS Score A colmn is a match if all the letters in the colmn are the same AAA AAA AAT ATC Only good for very similar seqences
102 Entropy Define freqencies for the occrrence of each letter in each colmn of mltiple alignment p A = 1, p T =p G =p C =0 1 st colmn p =075 =025 nd A 0.75, p T 0.25, p G =p C =0 2 colmn p A = 0.50, p T = 0.25, p C =0.25 p G =0 3 rd colmn Compte entropy of each colmn X = A, T, G p X, C log p X AAA AAA AAT ATC
103 Entropy: Example A A entropy A A = 0 Best case entropy A T G C = 1 4 log 1 4 = 4 Worst case = 2
104 Mltiple Alignment: Entropy Score Entropy for a mltiple alignment is the sm of entropies of its colmns: Σ over all colmns -ΣΣ X=A,T,G,C p X logp X
105 Entropy of an Alignment: Example colmn entropy: - p A logp A + p C logp C + p G logp G + p T logp T A A A A C C A C G A C T Colmn 1 = -[1*log1 + 0*log0 + 0*log0 +0*log0] = 0 Colmn 2 = -[ 1 / 4 *log 1 / / 4 *log 3 / 4 + 0*log0 + 0*log0] = -[ 1 / * / 4 *-.415 ] = Colmn 3 = -[ 1 / 4 *log 1 / / 4 *log 1 / / 4 *log 1 / / 4 *log 1 / 4 ] = 4* -[ 1 / 4 *-2] = +2.0 Alignment Entropy = =
106 Mltiple Alignment Indces Pairwise Alignments Every mltiple alignment indces pairwise alignments Indces: x: AC-GCGG-C y: AC-GC-GAG z: GCCGC-GAG x: ACGCGG-C; x: AC-GCGG-C; y: AC-GCGAG y: ACGC-GAC; z: GCCGC-GAG; z: GCCGCGAG
107 Sm of Pairs SP Scoring SP scoring is the standard method for scoring mltiple seqence alignments. Colmns are scored by a sm of pairs fnction sing a sbstittion matrix PAM or BLOSUM Assmes statistical independence for the colmns, does not se a phylogenetic tree.
108 Sm of Pairs ScoreSP-Score Score Consider pairwise alignment of seqences a i and a j imposed by a mltiple alignment of k seqences Denote the score of this sboptimal not necessarily optimal pairwise alignment as s*a i, a j Sm p the pairwise scores for a mltiple alignment: sa 1,,a, k k = Σ ij i,j s*a i, a j j
109 Compting SP-ScoreScore Aligning 4 seqences: 6 pairwise alignments Given a 1,a 2,a 3,a 4 : sa 1 a 4 = Σs*a i,a j = s*a 1,a 2 + s*a 1,a 3 + s*a 1,a 4 + s*a 2,a 3 + s*a 2,a 4 + s*a 3,a 4
110 SP-Score: Score: Example S a a. a 1 a k ATG-C-AAT AAT A-G-CATAT ATCCCATTT * 1... ak = S ai, a j i, j 22 n Pairs of Seqences May also calclate the scores colmn by colmn: A 1 A G 1 Score=3 µ 1 Score = 1 2µ 1 A C µ G Colmn 1 Colmn 3
111 Example Compte Sm of Pairs Score of the following mltiple alignment with match = 3, mismatch = -1, SX,- = -1, S-,- = 0 X: G T A C G Y: T G C C G Z: C G G C C W: C G G A C Sm of pairs = = 10
112 Mltiple alignment tools Clstal W Thompson, 1994 Most poplar PRRP Gotoh, 1993 HMMT Eddy, 1995 DIALIGN Morgenstern, 1998 T-Coffee Notredame, 2000 MUSCLE Edgar, 2004 Align-m Walle, 2004 PROBCONS Do, 2004
113 from: C. Notredame, Recent progresses in mltiple alignment: a srvey, Pharmacogenomics
114 Usefl links / / / shtm
115 Seqence Pattern Discovery High conservation only for short stretches of seqence. Statistical significance may not be high de to short length harder to identify these regions May be important for strctre and fnction and can be sed to identify diverged members of protein families.
116 Seqence Pattern Discovery From mltiple seqence alignments By searching for possible patterns in the set y g p p of seqences
117 Pattern Discovery from MSA emotif Uses 20 grops of amino acids to denote amino acids that can be sbstitted by each other For each colmn of the alignment determine which single grop can cover the whole colmn If cannot find a single grop for all seqences, look for a single grop for at least %30 of the seqences. By examining the possible colmn combinations, identify patterns Compte statistical significance by comparing to a backgrond distribtion
118 Pattern Discovery from MSA AACC Amino Acid Class Covering Uses an alternative set of gropings Similar to progressive MSA techniqes Use the gropings to represent the intermediate alignments. If they cannot be groped se the symbol X to denote the colmn is not conserved. The final alignment shows the seqence patterns common to all seqences can only find flly conserved patterns
119 Pattern Search in Unaligned Seqences Gibbs Uses a probabilistic model to search for a pattern of length W in a set of N seqences. Uses a model similar to PSSM constrction In addition the start of the pattern in each seqence is inclded d in the model Starts with a random pattern and iteratively refines the random patterns into an emergent pattern abot 100N iterations The starting gp positions are random first, the most likely starting positions are fond at each step and the profile model is also modified accordingly Another techniqe MEME is similar and ses the EM method.
Patterns, Profiles, and
Patterns, Profiles, and Mltiple l Alignments Otline Profiles, Position Specific Scoring Matrices Profile Hidden Markov Models Alignment of Profiles Mltiple Alignment Algorithms Problem definition Can we
More informationMultiple Sequence Alignment
Multiple Sequence Alignment Multiple Alignment versus Pairwise Alignment Up until now we have only tried to align two sequences.! What about more than two? And what for?! A faint similarity between two
More informationMultiple Sequence Alignment
Multiple Sequence Alignment Multiple Alignment versus Pairwise Alignment Up until now we have only tried to align two sequences. What about more than two? And what for? A faint similarity between two sequences
More informationMultiple Alignment. Slides revised and adapted to Bioinformática IST Ana Teresa Freitas
n Introduction to Bioinformatics lgorithms Multiple lignment Slides revised and adapted to Bioinformática IS 2005 na eresa Freitas n Introduction to Bioinformatics lgorithms Outline Dynamic Programming
More informationp(-,i)+p(,i)+p(-,v)+p(i,v),v)+p(i,v)
Multile Sequence Alignment Given: Set of sequences Score matrix Ga enalties Find: Alignment of sequences such that otimal score is achieved. Motivation Aligning rotein families Establish evolutionary relationshis
More informationSection 7.4: Integration of Rational Functions by Partial Fractions
Section 7.4: Integration of Rational Fnctions by Partial Fractions This is abot as complicated as it gets. The Method of Partial Fractions Ecept for a few very special cases, crrently we have no way to
More informationFEA Solution Procedure
EA Soltion Procedre (demonstrated with a -D bar element problem) EA Procedre for Static Analysis. Prepare the E model a. discretize (mesh) the strctre b. prescribe loads c. prescribe spports. Perform calclations
More informationAn Investigation into Estimating Type B Degrees of Freedom
An Investigation into Estimating Type B Degrees of H. Castrp President, Integrated Sciences Grop Jne, 00 Backgrond The degrees of freedom associated with an ncertainty estimate qantifies the amont of information
More informationBayes and Naïve Bayes Classifiers CS434
Bayes and Naïve Bayes Classifiers CS434 In this lectre 1. Review some basic probability concepts 2. Introdce a sefl probabilistic rle - Bayes rle 3. Introdce the learning algorithm based on Bayes rle (ths
More informationEssentials of optimal control theory in ECON 4140
Essentials of optimal control theory in ECON 4140 Things yo need to know (and a detail yo need not care abot). A few words abot dynamic optimization in general. Dynamic optimization can be thoght of as
More informationDepartment of Industrial Engineering Statistical Quality Control presented by Dr. Eng. Abed Schokry
Department of Indstrial Engineering Statistical Qality Control presented by Dr. Eng. Abed Schokry Department of Indstrial Engineering Statistical Qality Control C and U Chart presented by Dr. Eng. Abed
More informationBasics on bioinforma-cs Lecture 7. Nunzio D Agostino
Basics on bioinforma-cs Lecture 7 Nunzio D Agostino nunzio.dagostino@entecra.it; nunzio.dagostino@gmail.com Multiple alignments One sequence plays coy a pair of homologous sequence whisper many aligned
More informationClassify by number of ports and examine the possible structures that result. Using only one-port elements, no more than two elements can be assembled.
Jnction elements in network models. Classify by nmber of ports and examine the possible strctres that reslt. Using only one-port elements, no more than two elements can be assembled. Combining two two-ports
More informationSources of Non Stationarity in the Semivariogram
Sorces of Non Stationarity in the Semivariogram Migel A. Cba and Oy Leangthong Traditional ncertainty characterization techniqes sch as Simple Kriging or Seqential Gassian Simlation rely on stationary
More informationSequencing alignment Ameer Effat M. Elfarash
Sequencing alignment Ameer Effat M. Elfarash Dept. of Genetics Fac. of Agriculture, Assiut Univ. aelfarash@aun.edu.eg Why perform a multiple sequence alignment? MSAs are at the heart of comparative genomics
More informationDiscontinuous Fluctuation Distribution for Time-Dependent Problems
Discontinos Flctation Distribtion for Time-Dependent Problems Matthew Hbbard School of Compting, University of Leeds, Leeds, LS2 9JT, UK meh@comp.leeds.ac.k Introdction For some years now, the flctation
More informationTechnical Note. ODiSI-B Sensor Strain Gage Factor Uncertainty
Technical Note EN-FY160 Revision November 30, 016 ODiSI-B Sensor Strain Gage Factor Uncertainty Abstract Lna has pdated or strain sensor calibration tool to spport NIST-traceable measrements, to compte
More informationSequencing alignment Ameer Effat M. Elfarash
Sequencing alignment Ameer Effat M. Elfarash Dept. of Genetics Fac. of Agriculture, Assiut Univ. amir_effat@yahoo.com Why perform a multiple sequence alignment? MSAs are at the heart of comparative genomics
More informationPage 1. References. Hidden Markov models and multiple sequence alignment. Markov chains. Probability review. Example. Markovian sequence
Page Hidden Markov models and multiple sequence alignment Russ B Altman BMI 4 CS 74 Some slides borrowed from Scott C Schmidler (BMI graduate student) References Bioinformatics Classic: Krogh et al (994)
More informationBLOOM S TAXONOMY. Following Bloom s Taxonomy to Assess Students
BLOOM S TAXONOMY Topic Following Bloom s Taonomy to Assess Stdents Smmary A handot for stdents to eplain Bloom s taonomy that is sed for item writing and test constrction to test stdents to see if they
More informationMultiple Sequence Alignment using Profile HMM
Multiple Sequence Alignment using Profile HMM. based on Chapter 5 and Section 6.5 from Biological Sequence Analysis by R. Durbin et al., 1998 Acknowledgements: M.Sc. students Beatrice Miron, Oana Răţoi,
More informationTHEORY. Based on sequence Length According to the length of sequence being compared it is of following two types
Exp 11- THEORY Sequence Alignment is a process of aligning two sequences to achieve maximum levels of identity between them. This help to derive functional, structural and evolutionary relationships between
More informationChapter 4 Supervised learning:
Chapter 4 Spervised learning: Mltilayer Networks II Madaline Other Feedforward Networks Mltiple adalines of a sort as hidden nodes Weight change follows minimm distrbance principle Adaptive mlti-layer
More informationMath 116 First Midterm October 14, 2009
Math 116 First Midterm October 14, 9 Name: EXAM SOLUTIONS Instrctor: Section: 1. Do not open this exam ntil yo are told to do so.. This exam has 1 pages inclding this cover. There are 9 problems. Note
More informationTed Pedersen. Southern Methodist University. large sample assumptions implicit in traditional goodness
Appears in the Proceedings of the Soth-Central SAS Users Grop Conference (SCSUG-96), Astin, TX, Oct 27-29, 1996 Fishing for Exactness Ted Pedersen Department of Compter Science & Engineering Sothern Methodist
More informationLecture Notes: Finite Element Analysis, J.E. Akin, Rice University
9. TRUSS ANALYSIS... 1 9.1 PLANAR TRUSS... 1 9. SPACE TRUSS... 11 9.3 SUMMARY... 1 9.4 EXERCISES... 15 9. Trss analysis 9.1 Planar trss: The differential eqation for the eqilibrim of an elastic bar (above)
More informationSensitivity Analysis in Bayesian Networks: From Single to Multiple Parameters
Sensitivity Analysis in Bayesian Networks: From Single to Mltiple Parameters Hei Chan and Adnan Darwiche Compter Science Department University of California, Los Angeles Los Angeles, CA 90095 {hei,darwiche}@cs.cla.ed
More information5. MULTIPLE SEQUENCE ALIGNMENT BIOINFORMATICS COURSE MTAT
5. MULTIPLE SEQUENCE ALIGNMENT BIOINFORMATICS COURSE MTAT.03.239 03.10.2012 ALIGNMENT Alignment is the task of locating equivalent regions of two or more sequences to maximize their similarity. Homology:
More informationA Characterization of Interventional Distributions in Semi-Markovian Causal Models
A Characterization of Interventional Distribtions in Semi-Markovian Casal Models Jin Tian and Changsng Kang Department of Compter Science Iowa State University Ames, IA 50011 {jtian, cskang}@cs.iastate.ed
More informationUNCERTAINTY FOCUSED STRENGTH ANALYSIS MODEL
8th International DAAAM Baltic Conference "INDUSTRIAL ENGINEERING - 19-1 April 01, Tallinn, Estonia UNCERTAINTY FOCUSED STRENGTH ANALYSIS MODEL Põdra, P. & Laaneots, R. Abstract: Strength analysis is a
More informationHigher Maths A1.3 Recurrence Relations - Revision
Higher Maths A Recrrence Relations - Revision This revision pack covers the skills at Unit Assessment exam level or Recrrence Relations so yo can evalate yor learning o this otcome It is important that
More informationEXERCISES WAVE EQUATION. In Problems 1 and 2 solve the heat equation (1) subject to the given conditions. Assume a rod of length L.
.4 WAVE EQUATION 445 EXERCISES.3 In Problems and solve the heat eqation () sbject to the given conditions. Assme a rod of length.. (, t), (, t) (, ),, > >. (, t), (, t) (, ) ( ) 3. Find the temperatre
More informationThe Cross Product of Two Vectors in Space DEFINITION. Cross Product. u * v = s ƒ u ƒƒv ƒ sin ud n
12.4 The Cross Prodct 873 12.4 The Cross Prodct In stdying lines in the plane, when we needed to describe how a line was tilting, we sed the notions of slope and angle of inclination. In space, we want
More informationm = Average Rate of Change (Secant Slope) Example:
Average Rate o Change Secant Slope Deinition: The average change secant slope o a nction over a particlar interval [a, b] or [a, ]. Eample: What is the average rate o change o the nction over the interval
More informationSTEP Support Programme. STEP III Hyperbolic Functions: Solutions
STEP Spport Programme STEP III Hyperbolic Fnctions: Soltions Start by sing the sbstittion t cosh x. This gives: sinh x cosh a cosh x cosh a sinh x t sinh x dt t dt t + ln t ln t + ln cosh a ln ln cosh
More informationDecision Making in Complex Environments. Lecture 2 Ratings and Introduction to Analytic Network Process
Decision Making in Complex Environments Lectre 2 Ratings and Introdction to Analytic Network Process Lectres Smmary Lectre 5 Lectre 1 AHP=Hierar chies Lectre 3 ANP=Networks Strctring Complex Models with
More information3.1 The Basic Two-Level Model - The Formulas
CHAPTER 3 3 THE BASIC MULTILEVEL MODEL AND EXTENSIONS In the previos Chapter we introdced a nmber of models and we cleared ot the advantages of Mltilevel Models in the analysis of hierarchically nested
More informationCopyright 2000 N. AYDIN. All rights reserved. 1
Introduction to Bioinformatics Prof. Dr. Nizamettin AYDIN naydin@yildiz.edu.tr Multiple Sequence Alignment Outline Multiple sequence alignment introduction to msa methods of msa progressive global alignment
More informationL = 2 λ 2 = λ (1) In other words, the wavelength of the wave in question equals to the string length,
PHY 309 L. Soltions for Problem set # 6. Textbook problem Q.20 at the end of chapter 5: For any standing wave on a string, the distance between neighboring nodes is λ/2, one half of the wavelength. The
More informationSetting The K Value And Polarization Mode Of The Delta Undulator
LCLS-TN-4- Setting The Vale And Polarization Mode Of The Delta Undlator Zachary Wolf, Heinz-Dieter Nhn SLAC September 4, 04 Abstract This note provides the details for setting the longitdinal positions
More informationPrandl established a universal velocity profile for flow parallel to the bed given by
EM 0--00 (Part VI) (g) The nderlayers shold be at least three thicknesses of the W 50 stone, bt never less than 0.3 m (Ahrens 98b). The thickness can be calclated sing Eqation VI-5-9 with a coefficient
More informationThe Dual of the Maximum Likelihood Method
Department of Agricltral and Resorce Economics University of California, Davis The Dal of the Maximm Likelihood Method by Qirino Paris Working Paper No. 12-002 2012 Copyright @ 2012 by Qirino Paris All
More informationInformation Source Detection in the SIR Model: A Sample Path Based Approach
Information Sorce Detection in the SIR Model: A Sample Path Based Approach Kai Zh and Lei Ying School of Electrical, Compter and Energy Engineering Arizona State University Tempe, AZ, United States, 85287
More informationMoreover, the circular logic
Moreover, the circular logic How do we know what is the right distance without a good alignment? And how do we construct a good alignment without knowing what substitutions were made previously? ATGCGT--GCAAGT
More informationAlgorithms in Bioinformatics FOUR Pairwise Sequence Alignment. Pairwise Sequence Alignment. Convention: DNA Sequences 5. Sequence Alignment
Algorithms in Bioinformatics FOUR Sami Khuri Department of Computer Science San José State University Pairwise Sequence Alignment Homology Similarity Global string alignment Local string alignment Dot
More informationStudy on the impulsive pressure of tank oscillating by force towards multiple degrees of freedom
EPJ Web of Conferences 80, 0034 (08) EFM 07 Stdy on the implsive pressre of tank oscillating by force towards mltiple degrees of freedom Shigeyki Hibi,* The ational Defense Academy, Department of Mechanical
More informationPair Hidden Markov Models
Pair Hidden Markov Models Scribe: Rishi Bedi Lecturer: Serafim Batzoglou January 29, 2015 1 Recap of HMMs alphabet: Σ = {b 1,...b M } set of states: Q = {1,..., K} transition probabilities: A = [a ij ]
More informationDNA and protein databases. EMBL/GenBank/DDBJ database of nucleic acids
Database searches 1 DNA and protein databases EMBL/GenBank/DDBJ database of nucleic acids 2 DNA and protein databases EMBL/GenBank/DDBJ database of nucleic acids (cntd) 3 DNA and protein databases SWISS-PROT
More informationMultiple Sequence Alignment: A Critical Comparison of Four Popular Programs
Multiple Sequence Alignment: A Critical Comparison of Four Popular Programs Shirley Sutton, Biochemistry 218 Final Project, March 14, 2008 Introduction For both the computational biologist and the research
More informationStatistical Machine Learning Methods for Bioinformatics II. Hidden Markov Model for Biological Sequences
Statistical Machine Learning Methods for Bioinformatics II. Hidden Markov Model for Biological Sequences Jianlin Cheng, PhD Department of Computer Science University of Missouri 2008 Free for Academic
More informationEXPT. 5 DETERMINATION OF pk a OF AN INDICATOR USING SPECTROPHOTOMETRY
EXPT. 5 DETERMITIO OF pk a OF IDICTOR USIG SPECTROPHOTOMETRY Strctre 5.1 Introdction Objectives 5.2 Principle 5.3 Spectrophotometric Determination of pka Vale of Indicator 5.4 Reqirements 5.5 Soltions
More informationOptimization via the Hamilton-Jacobi-Bellman Method: Theory and Applications
Optimization via the Hamilton-Jacobi-Bellman Method: Theory and Applications Navin Khaneja lectre notes taken by Christiane Koch Jne 24, 29 1 Variation yields a classical Hamiltonian system Sppose that
More informationIntrodction Finite elds play an increasingly important role in modern digital commnication systems. Typical areas of applications are cryptographic sc
A New Architectre for a Parallel Finite Field Mltiplier with Low Complexity Based on Composite Fields Christof Paar y IEEE Transactions on Compters, Jly 996, vol 45, no 7, pp 856-86 Abstract In this paper
More informationPhysicsAndMathsTutor.com
C Integration - By sbstittion PhysicsAndMathsTtor.com. Using the sbstittion cos +, or otherwise, show that e cos + sin d e(e ) (Total marks). (a) Using the sbstittion cos, or otherwise, find the eact vale
More information1. Tractable and Intractable Computational Problems So far in the course we have seen many problems that have polynomial-time solutions; that is, on
. Tractable and Intractable Comptational Problems So far in the corse we have seen many problems that have polynomial-time soltions; that is, on a problem instance of size n, the rnning time T (n) = O(n
More informationDiscussion of The Forward Search: Theory and Data Analysis by Anthony C. Atkinson, Marco Riani, and Andrea Ceroli
1 Introdction Discssion of The Forward Search: Theory and Data Analysis by Anthony C. Atkinson, Marco Riani, and Andrea Ceroli Søren Johansen Department of Economics, University of Copenhagen and CREATES,
More informationVIBRATION MEASUREMENT UNCERTAINTY AND RELIABILITY DIAGNOSTICS RESULTS IN ROTATING SYSTEMS
VIBRATIO MEASUREMET UCERTAITY AD RELIABILITY DIAGOSTICS RESULTS I ROTATIG SYSTEMS. Introdction M. Eidkevicite, V. Volkovas anas University of Technology, Lithania The rotating machinery technical state
More informationBIOSTATISTICAL METHODS
BIOSTATISTICAL METHOS FOR TRANSLATIONAL & CLINICAL RESEARCH ROC Crve: IAGNOSTIC MEICINE iagnostic tests have been presented as alwas having dichotomos otcomes. In some cases, the reslt of the test ma be
More informationMultiple sequence alignment
Multiple sequence alignment Multiple sequence alignment: today s goals to define what a multiple sequence alignment is and how it is generated; to describe profile HMMs to introduce databases of multiple
More informationFEA Solution Procedure
EA Soltion Procedre (demonstrated with a -D bar element problem) MAE 5 - inite Element Analysis Several slides from this set are adapted from B.S. Altan, Michigan Technological University EA Procedre for
More informationFOUNTAIN codes [3], [4] provide an efficient solution
Inactivation Decoding of LT and Raptor Codes: Analysis and Code Design Francisco Lázaro, Stdent Member, IEEE, Gianligi Liva, Senior Member, IEEE, Gerhard Bach, Fellow, IEEE arxiv:176.5814v1 [cs.it 19 Jn
More informationEECS730: Introduction to Bioinformatics
EECS730: Introduction to Bioinformatics Lecture 07: profile Hidden Markov Model http://bibiserv.techfak.uni-bielefeld.de/sadr2/databasesearch/hmmer/profilehmm.gif Slides adapted from Dr. Shaojie Zhang
More informationStatistical Machine Learning Methods for Biomedical Informatics II. Hidden Markov Model for Biological Sequences
Statistical Machine Learning Methods for Biomedical Informatics II. Hidden Markov Model for Biological Sequences Jianlin Cheng, PhD William and Nancy Thompson Missouri Distinguished Professor Department
More information1 The space of linear transformations from R n to R m :
Math 540 Spring 20 Notes #4 Higher deriaties, Taylor s theorem The space of linear transformations from R n to R m We hae discssed linear transformations mapping R n to R m We can add sch linear transformations
More informationANOVA INTERPRETING. It might be tempting to just look at the data and wing it
Introdction to Statistics in Psychology PSY 2 Professor Greg Francis Lectre 33 ANalysis Of VAriance Something erss which thing? ANOVA Test statistic: F = MS B MS W Estimated ariability from noise and mean
More informationTheoretical and Experimental Implementation of DC Motor Nonlinear Controllers
Theoretical and Experimental Implementation of DC Motor Nonlinear Controllers D.R. Espinoza-Trejo and D.U. Campos-Delgado Facltad de Ingeniería, CIEP, UASLP, espinoza trejo dr@aslp.mx Facltad de Ciencias,
More informationCONTENTS. INTRODUCTION MEQ curriculum objectives for vectors (8% of year). page 2 What is a vector? What is a scalar? page 3, 4
CONTENTS INTRODUCTION MEQ crriclm objectives for vectors (8% of year). page 2 What is a vector? What is a scalar? page 3, 4 VECTOR CONCEPTS FROM GEOMETRIC AND ALGEBRAIC PERSPECTIVES page 1 Representation
More informationUniversal Scheme for Optimal Search and Stop
Universal Scheme for Optimal Search and Stop Sirin Nitinawarat Qalcomm Technologies, Inc. 5775 Morehose Drive San Diego, CA 92121, USA Email: sirin.nitinawarat@gmail.com Vengopal V. Veeravalli Coordinated
More informationUncertainties of measurement
Uncertainties of measrement Laboratory tas A temperatre sensor is connected as a voltage divider according to the schematic diagram on Fig.. The temperatre sensor is a thermistor type B5764K [] with nominal
More informationApproximate Solution for the System of Non-linear Volterra Integral Equations of the Second Kind by using Block-by-block Method
Astralian Jornal of Basic and Applied Sciences, (1): 114-14, 008 ISSN 1991-8178 Approximate Soltion for the System of Non-linear Volterra Integral Eqations of the Second Kind by sing Block-by-block Method
More informationA Note on Irreducible Polynomials and Identity Testing
A Note on Irrecible Polynomials an Ientity Testing Chanan Saha Department of Compter Science an Engineering Inian Institte of Technology Kanpr Abstract We show that, given a finite fiel F q an an integer
More information10.2 Solving Quadratic Equations by Completing the Square
. Solving Qadratic Eqations b Completing the Sqare Consider the eqation ( ) We can see clearl that the soltions are However, What if the eqation was given to s in standard form, that is 6 How wold we go
More informationDiscussion Papers Department of Economics University of Copenhagen
Discssion Papers Department of Economics University of Copenhagen No. 10-06 Discssion of The Forward Search: Theory and Data Analysis, by Anthony C. Atkinson, Marco Riani, and Andrea Ceroli Søren Johansen,
More information11.3 Decoding Algorithm
11.3 Decoding Algorithm 393 For convenience, we have introduced π 0 and π n+1 as the fictitious initial and terminal states begin and end. This model defines the probability P(x π) for a given sequence
More informationIMPROVED ANALYSIS OF BOLTED SHEAR CONNECTION UNDER ECCENTRIC LOADS
Jornal of Marine Science and Technology, Vol. 5, No. 4, pp. 373-38 (17) 373 DOI: 1.6119/JMST-17-3-1 IMPROVED ANALYSIS OF BOLTED SHEAR ONNETION UNDER EENTRI LOADS Dng-Mya Le 1, heng-yen Liao, hien-hien
More informationLecture Notes On THEORY OF COMPUTATION MODULE - 2 UNIT - 2
BIJU PATNAIK UNIVERSITY OF TECHNOLOGY, ODISHA Lectre Notes On THEORY OF COMPUTATION MODULE - 2 UNIT - 2 Prepared by, Dr. Sbhend Kmar Rath, BPUT, Odisha. Tring Machine- Miscellany UNIT 2 TURING MACHINE
More informationCONCEPT OF SEQUENCE COMPARISON. Natapol Pornputtapong 18 January 2018
CONCEPT OF SEQUENCE COMPARISON Natapol Pornputtapong 18 January 2018 SEQUENCE ANALYSIS - A ROSETTA STONE OF LIFE Sequence analysis is the process of subjecting a DNA, RNA or peptide sequence to any of
More informationAssignment Fall 2014
Assignment 5.086 Fall 04 De: Wednesday, 0 December at 5 PM. Upload yor soltion to corse website as a zip file YOURNAME_ASSIGNMENT_5 which incldes the script for each qestion as well as all Matlab fnctions
More informationSimplified Identification Scheme for Structures on a Flexible Base
Simplified Identification Scheme for Strctres on a Flexible Base L.M. Star California State University, Long Beach G. Mylonais University of Patras, Greece J.P. Stewart University of California, Los Angeles
More informationA New Approach to Direct Sequential Simulation that Accounts for the Proportional Effect: Direct Lognormal Simulation
A ew Approach to Direct eqential imlation that Acconts for the Proportional ffect: Direct ognormal imlation John Manchk, Oy eangthong and Clayton Detsch Department of Civil & nvironmental ngineering University
More informationMultiple Sequence Alignment, Gunnar Klau, December 9, 2005, 17:
Multiple Sequence Alignment, Gunnar Klau, December 9, 2005, 17:50 5001 5 Multiple Sequence Alignment The first part of this exposition is based on the following sources, which are recommended reading:
More information1 Undiscounted Problem (Deterministic)
Lectre 9: Linear Qadratic Control Problems 1 Undisconted Problem (Deterministic) Choose ( t ) 0 to Minimize (x trx t + tq t ) t=0 sbject to x t+1 = Ax t + B t, x 0 given. x t is an n-vector state, t a
More information3.4-Miscellaneous Equations
.-Miscellaneos Eqations Factoring Higher Degree Polynomials: Many higher degree polynomials can be solved by factoring. Of particlar vale is the method of factoring by groping, however all types of factoring
More informationHidden Markov Models. Main source: Durbin et al., Biological Sequence Alignment (Cambridge, 98)
Hidden Markov Models Main source: Durbin et al., Biological Sequence Alignment (Cambridge, 98) 1 The occasionally dishonest casino A P A (1) = P A (2) = = 1/6 P A->B = P B->A = 1/10 B P B (1)=0.1... P
More informationVectors in Rn un. This definition of norm is an extension of the Pythagorean Theorem. Consider the vector u = (5, 8) in R 2
MATH 307 Vectors in Rn Dr. Neal, WKU Matrices of dimension 1 n can be thoght of as coordinates, or ectors, in n- dimensional space R n. We can perform special calclations on these ectors. In particlar,
More informationPrediction of Transmission Distortion for Wireless Video Communication: Analysis
Prediction of Transmission Distortion for Wireless Video Commnication: Analysis Zhifeng Chen and Dapeng W Department of Electrical and Compter Engineering, University of Florida, Gainesville, Florida 326
More informationCHANNEL SELECTION WITH RAYLEIGH FADING: A MULTI-ARMED BANDIT FRAMEWORK. Wassim Jouini and Christophe Moy
CHANNEL SELECTION WITH RAYLEIGH FADING: A MULTI-ARMED BANDIT FRAMEWORK Wassim Joini and Christophe Moy SUPELEC, IETR, SCEE, Avene de la Bolaie, CS 47601, 5576 Cesson Sévigné, France. INSERM U96 - IFR140-
More informationStephen Scott.
1 / 21 sscott@cse.unl.edu 2 / 21 Introduction Designed to model (profile) a multiple alignment of a protein family (e.g., Fig. 5.1) Gives a probabilistic model of the proteins in the family Useful for
More informationFrequency Estimation, Multiple Stationary Nonsinusoidal Resonances With Trend 1
Freqency Estimation, Mltiple Stationary Nonsinsoidal Resonances With Trend 1 G. Larry Bretthorst Department of Chemistry, Washington University, St. Lois MO 6313 Abstract. In this paper, we address the
More informationSign-reductions, p-adic valuations, binomial coefficients modulo p k and triangular symmetries
Sign-redctions, p-adic valations, binomial coefficients modlo p k and trianglar symmetries Mihai Prnesc Abstract According to a classical reslt of E. Kmmer, the p-adic valation v p applied to a binomial
More informationLecture 3. (2) Last time: 3D space. The dot product. Dan Nichols January 30, 2018
Lectre 3 The dot prodct Dan Nichols nichols@math.mass.ed MATH 33, Spring 018 Uniersity of Massachsetts Janary 30, 018 () Last time: 3D space Right-hand rle, the three coordinate planes 3D coordinate system:
More informationCollective Inference on Markov Models for Modeling Bird Migration
Collective Inference on Markov Models for Modeling Bird Migration Daniel Sheldon Cornell University dsheldon@cs.cornell.ed M. A. Saleh Elmohamed Cornell University saleh@cam.cornell.ed Dexter Kozen Cornell
More informationChange of Variables. (f T) JT. f = U
Change of Variables 4-5-8 The change of ariables formla for mltiple integrals is like -sbstittion for single-ariable integrals. I ll gie the general change of ariables formla first, and consider specific
More informationLinear System Theory (Fall 2011): Homework 1. Solutions
Linear System Theory (Fall 20): Homework Soltions De Sep. 29, 20 Exercise (C.T. Chen: Ex.3-8). Consider a linear system with inpt and otpt y. Three experiments are performed on this system sing the inpts
More informationA Model-Free Adaptive Control of Pulsed GTAW
A Model-Free Adaptive Control of Plsed GTAW F.L. Lv 1, S.B. Chen 1, and S.W. Dai 1 Institte of Welding Technology, Shanghai Jiao Tong University, Shanghai 00030, P.R. China Department of Atomatic Control,
More informationJoint Transfer of Energy and Information in a Two-hop Relay Channel
Joint Transfer of Energy and Information in a Two-hop Relay Channel Ali H. Abdollahi Bafghi, Mahtab Mirmohseni, and Mohammad Reza Aref Information Systems and Secrity Lab (ISSL Department of Electrical
More informationLecture 14: Multiple Sequence Alignment (Gene Finding, Conserved Elements) Scribe: John Ekins
Lecture 14: Multiple Sequence Alignment (Gene Finding, Conserved Elements) 2 19 2015 Scribe: John Ekins Multiple Sequence Alignment Given N sequences x 1, x 2,, x N : Insert gaps in each of the sequences
More informationControl Systems
6.5 Control Systems Last Time: Introdction Motivation Corse Overview Project Math. Descriptions of Systems ~ Review Classification of Systems Linear Systems LTI Systems The notion of state and state variables
More informationPHASE STEERING AND FOCUSING BEHAVIOR OF ULTRASOUND IN CEMENTITIOUS MATERIALS
PHAS STRING AND FOCUSING BHAVIOR OF ULTRASOUND IN CMNTITIOUS MATRIALS Shi-Chang Wooh and Lawrence Azar Department of Civil and nvironmental ngineering Massachsetts Institte of Technology Cambridge, MA
More informationCosmic Microwave Background Radiation. Carl W. Akerlof April 7, 2013
Cosmic Microwave Backgrond Radiation Carl W. Akerlof April 7, 013 Notes: Dry ice sblimation temperatre: Isopropyl alcohol freezing point: LNA operating voltage: 194.65 K 184.65 K 18.0 v he terrestrial
More information