7 Protein secondary structure
Grundlagen der Bioinformatik, SS 10, D. Huson, June 17, 2010

Sources for this chapter, which are all recommended reading:

Introduction to Protein Structure, Branden & Tooze.
V.V. Solovyev and I.N. Shindyalov. Properties and prediction of protein secondary structure. In Current Topics in Computational Molecular Biology, T. Jiang, Y. Xu and M.Q. Zhang (editors), MIT Press, chapter 15, 2002.
D.W. Mount. Bioinformatics: Sequences and Genome Analysis, Cold Spring Harbor Press. Chapter 9: Protein classification and structure prediction.

Proteins

A protein is a chain of amino acids joined by peptide bonds. It is usually produced by a ribosome that moves along an mRNA and adds amino acids according to the codons that it encounters in the mRNA.

Here are the 20 standard amino acids:

Name            3-letter  1-letter
Alanine         Ala       A
Cysteine        Cys       C
Aspartic acid   Asp       D
Glutamic acid   Glu       E
Phenylalanine   Phe       F
Glycine         Gly       G
Histidine       His       H
Isoleucine      Ile       I
Lysine          Lys       K
Leucine         Leu       L
Methionine      Met       M
Asparagine      Asn       N
Proline         Pro       P
Glutamine       Gln       Q
Arginine        Arg       R
Serine          Ser       S
Threonine       Thr       T
Valine          Val       V
Tryptophan      Trp       W
Tyrosine        Tyr       Y

Here is a classification of these amino acids:

[Figure: classification of the amino acids]

Here are two amino acids within a polypeptide chain:
[Figure: two amino acids with side chains R, joined by a peptide bond within a polypeptide chain]

Neighboring amino acids are joined by a peptide bond between the C=O and NH groups. A chain of repeated N-Cα-C units makes up the backbone of the protein.

In such a polypeptide chain, each amino acid has two rotational degrees of freedom: the rotational angle φ ("phi") of the bond between N and Cα, and the rotational angle ψ ("psi") of the bond between Cα and C. Both bonds are free to rotate, subject to spatial constraints posed by adjacent R groups.

The third angle Ω of the peptide bond between the C=O and NH groups is nearly always 180°, which implies the planarity of the peptide bond.

Polypeptide chains of amino acids are called protein sequences. A short chain or fragment is also called a peptide sequence.

A protein (sequence) starts with a free NH2 group (the N-terminus) and ends with a free COOH group (the C-terminus). Example:

[Figure: a polypeptide chain drawn from the N-terminus to the C-terminus]

The amino acid sequence here is: R1 R2 R3

Hierarchy of protein structure

We distinguish between four levels of protein structure (Linderstrøm-Lang & Schellman 1959):

Primary structure: The sequence of amino acid residues in a polypeptide chain.
Secondary structure: Helices and β-sheets that are formed by hydrogen bonds between the C=O and NH groups of the backbone.
Tertiary structure: The three-dimensional structure of a polypeptide chain, consisting of secondary structure elements linked by loops and stabilized (primarily) by side-chain interactions.
Quaternary structure: The aggregation of different polypeptide chains into a functional protein.

[Figure: the levels of protein structure. From Protein Structure and Function, by G.A. Petsko and D. Ringe]

The values of all pairs of rotation angles φ and ψ determine the tertiary structure of a protein.

The tertiary structure of proteins is of great interest, as the shape of a protein determines much, if not all, of its function. Here is the structure of myoglobin, the first experimentally derived structure:

[Figure: structure of myoglobin]

The experimental determination of protein structure via X-ray crystallography or NMR is difficult and time-consuming.

7.2 The Holy Grail of Bioinformatics

Central biochemical assumption: sequence specifies 3D structure. Hence, we would like to be able to determine the structure of a protein from its sequence.

The Holy Grail of Bioinformatics: Develop an algorithm that can reliably predict the structure (and thus function) of a protein from its amino acid sequence:
MATGDERFYAEHLMPTLQGLLDPESAHR
LAVRFTSLGLLPRARFQDSDMLEVRVLGH
KFRNPVGIAAGFDKHGEAVDGLYKMGFGF
VEIGSVTPKPQEGNPRPRVFRLPEDQAVIN
RYGFNSHGLSVVEHRLRARQQKQAKLTED
GLPLGVNLGKNKTSVDAAEDYAEGVRVLG
PLADYLVVNVSSPNTAGLRSLQGKAELRR
LLTKVLQERDGLRRVHRPAVLVKIAPDLTS
QDKEDIASVVKELGIDGLIVTNTTVSRPAGL
QGALRSETGGLSGKPLRDLSTQTIREMYAL
TQGRVPIIGVGGVSSGQDALEKIRAGASLVQ
LYTALTFWGPPVVGKVKRELEALLKEQGFG
GVTDAIGADHRR...?

We will return to this problem in the next chapter.

Secondary structure of proteins

Regular features of the main chain of a protein give rise to the secondary structure. Determining the secondary structure is an important first step toward determining the three-dimensional structure.

There are two main types of (repetitive) secondary structure elements, called α-helices and β-sheets (L. Pauling 1951), corresponding to specific choices of the φ and ψ angles along the chain.

Ramachandran Plot

In a Ramachandran plot, pairs of torsion angles (φ, ψ) are plotted in a scatter plot. Certain torsion angle pairs are energetically particularly favorable. The following is a Ramachandran plot of observed pairs of angles in a collection of known protein structures:

[Figure: Ramachandran plot of observed (φ, ψ) pairs]

The pairs near φ = −60° and ψ = −40° correspond to α-helices. The pairs near (−90°, 120°) correspond to β-strands.

α-helices

Helices arise when hydrogen bonds occur between (the C=O group of) the amino acid at position i and (the NH group of) the amino acid at position i + k (with k = 3, 4 or 5), for a run of consecutive values of i. Here is the bonding pattern of an α-helix:
[Figure: hydrogen-bonding pattern between the C=O and NH groups of an α-helix]

Usually, k = 4 and the resulting structure is called an α-helix. The (idealized) torsion angles are (φ, ψ) = (−58°, −47°) and there are 3.6 residues per turn.

Seldom, k = 3 and then we have a 3_10-helix. The (idealized) torsion angles are (−74°, −4°) and there are 3.0 residues per turn.

Very rarely, k = 5 and then we have a π-helix. The idealized torsion angles are (−57°, −70°) and there are 4.4 residues per turn.

β-sheets

So-called β-sheets consist of β-strands that are runs of 5-10 consecutive amino acids, which are held together by H bonds:

[Figure: β-strands held together by hydrogen bonds]

There are two possible configurations of β-sheets. In a parallel β-sheet, all chains run in the same direction, while in an anti-parallel sheet, chains run
in alternating directions:

[Figure: parallel and anti-parallel β-sheets]

Example of an anti-parallel β-sheet (variable light chain of an immunoglobulin):

[Figure: anti-parallel β-sheet in the variable light chain of an immunoglobulin]

Loops

All other (non-repetitive) structures are called loops. Loops are regions of a protein chain that lie between α-helices and β-sheets. The lengths and three-dimensional structures of loops can vary. Hairpin loops joining two anti-parallel β-strands may be as short as two amino acids. Loops lie on the surface of the structure.

Turns are narrow 180° loops that contain at least 3 amino acids.

A region of secondary structure that is not a helix, a sheet, or a recognizable turn is called a coil.

7.4 Classification of protein structures

Proteins are classified to reflect both structural and evolutionary relatedness. A typical classification scheme will employ different hierarchical levels, such as:

1. Folds: Based on major structural similarities.
2. Superfamilies: Based on probable evolutionary relationships.
3. Families: Based on clear evolutionary relationships.

Mount describes six principal classes of protein structures based on the three-dimensional arrangement of secondary structures, four taken from Levitt and Chothia (1976), and two additional ones taken from the SCOP database (Murzin et al., 1995). (The four-letter codes are PDB accession numbers.)

(1) A member of class α consists of a bundle of α-helices connected by loops on the surface of the protein, e.g. hemoglobin (3hhb).
(2) A member of class β consists of β-sheets, usually two sheets in close contact forming a sandwich. Examples are enzymes, transport proteins and antibodies, e.g. T-cell receptor CD8 (1cd8).
(3) A member of class α/β consists mainly of β-sheets with intervening α-helices. This class contains many metabolic enzymes, e.g. tryptophan synthase β subunit (2tsy).
(4) A member of class α + β consists of segregated α-helices and β-sheets, e.g. G-specific endonuclease (1rnb).
(5) This class consists of all multi-domain (α and β) proteins with domains from more than one of the above four classes.
(6) Membrane and cell-surface proteins and peptides, e.g. the integral membrane light-harvesting complex (1kzu).

Databases

The databases SCOP and CATH (cathdb.info) both contain a hierarchical classification of protein domains by their structures.

37,000 structures in the PDB
971 folds in SCOP (release 1.71)

Number of folds in the six classes in SCOP:

Class                                 No. of folds
α                                     226
β                                     149
α/β                                   134
α + β                                 286
Multi-domain proteins                 48
Membrane and cell surface proteins

Computing the secondary structure of a known 3D structure

Given the positions of the main chain atoms of a protein, the DSSP (definition of secondary structure of proteins) program [1] determines the secondary structure of the protein. (It also computes geometrical features and solvent exposure.) Note that the program does not predict protein secondary structures from sequences, but rather it computes them from coordinates.

[1] Kabsch and Sander, Dictionary of Protein Secondary Structure: Pattern Recognition of Hydrogen-Bonded and Geometrical Features. Biopolymers 22, 1983.

The DSSP algorithm proceeds as follows: First determine which C=O and NH groups in the main chain are joined by hydrogen bonds. This decision is based on an electrostatic model using the following energy calculation:

$$E = q_1 q_2 \left( \frac{1}{r(ON)} + \frac{1}{r(CH)} - \frac{1}{r(OH)} - \frac{1}{r(CN)} \right) f,$$

with $q_1 = 0.42e$ and $q_2 = 0.20e$, where e is the unit electron charge, $r(AB)$ is the inter-atomic distance in Angstroms between atom A in the first amino acid and atom B in the second, $f = 332$ is a constant called the dimensionality factor, and E is the energy in kcal/mol.

Hydrogen bonds have a binding energy of about −3 kcal/mol. However, DSSP already assigns an H-bond between the C=O of residue i and the NH of residue j if E < −0.5 kcal/mol.

Any H-bond detected in this way is called a k-turn if it connects the C=O group of amino acid i to the NH group of amino acid i + k, where k = 3, 4 or 5, and a bridge if it connects residues that are not close to each other in the sequence. Here are some of the patterns that are used to identify secondary structure elements:

[Figure: hydrogen-bond patterns along NH-Cα-C=O backbones for a 3-turn, a parallel bridge and an anti-parallel bridge]

An α-helix is identified as a consecutive run of (at least two) 4-turns. Any two helices that are offset by two or three residues are concatenated into a single helix.

A β-sheet corresponds to a sequence of bridges between consecutive residues in two different regions of the chain. More precisely, we need to introduce two types of patterns: a ladder is a set of one or more consecutive bridges of the same type, and a sheet is one or more ladders connected by shared residues. Detected sheets are then defined to be β-sheets.

To allow for irregularities, β-bulges are introduced, in which two perfect ladders or bridges can be connected through a gap of one residue on one side and four on the other.

This is how α-helices and β-sheets are defined, detected and annotated in practice.
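The hydrogen-bond test used by DSSP can be sketched in a few lines of Python. This is only an illustration of the energy formula and the −0.5 kcal/mol cutoff given above, not the DSSP program itself. The function names and the coordinates in the usage example are hypothetical, and the sketch assumes that the position of the amide hydrogen is already known.

```python
import math

Q1, Q2 = 0.42, 0.20   # partial charges in units of e, as given in the text
F = 332.0             # dimensionality factor from the text
E_CUTOFF = -0.5       # kcal/mol, H-bond threshold from the text


def hbond_energy(c, o, n, h):
    """Electrostatic energy (kcal/mol) between the C=O group of one residue
    (atom positions c, o) and the N-H group of another (positions n, h).
    All positions are (x, y, z) tuples in Angstroms."""
    return Q1 * Q2 * F * (
        1.0 / math.dist(o, n) + 1.0 / math.dist(c, h)
        - 1.0 / math.dist(o, h) - 1.0 / math.dist(c, n)
    )


def is_hbond(c, o, n, h):
    """True if the energy falls below the cutoff."""
    return hbond_energy(c, o, n, h) < E_CUTOFF


def turn_type(i, j):
    """Return k if an H-bond from C=O(i) to NH(j) is a k-turn (k = 3, 4, 5),
    otherwise None (for example, a candidate bridge)."""
    k = j - i
    return k if k in (3, 4, 5) else None


# Hypothetical, idealized collinear geometry: C=O ... H-N with O...H = 1.9 A.
c, o = (0.0, 0.0, 0.0), (1.23, 0.0, 0.0)
h, n = (3.13, 0.0, 0.0), (4.13, 0.0, 0.0)
energy = hbond_energy(c, o, n, h)   # roughly -2.9 kcal/mol
```

With this made-up geometry the energy comes out close to the typical binding energy of about −3 kcal/mol mentioned above, so the pair is accepted as an H-bond.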
7.6 Secondary structure prediction from sequences

Secondary structure prediction problem: Assume we are given a protein sequence, e.g.:

MATVAERCPICLEDPSNYSMALPCL
HAFCYVCITRWIRQNPTCPLCKVPV
ESVVHTIESDSEFGDQLI

The secondary structure prediction problem is to assign a secondary structure type to each amino acid in the sequence, e.g. S (for strand), H (for helical), C (for coils or loops):
MATVAERCPICLEDPSNYSMALPCL
SSSCCCC
HAFCYVCITRWIRQNPTCPLCKVPV
SSS-CCHHHHHHHH---CCCC----
ESVVHTIESDSEFGDQLI
--SS

SSP and discriminant analysis

The secondary structure prediction program (SSP) developed by Solovyev and Salamov (1991, 1994) [2] is aimed at getting the location of entire α-helices and β-strands correct rather than assigning each individual residue to the correct type of secondary structure.

The SSP algorithm is based on the assumption that secondary structures can be identified by statistical properties of five regions associated with an α-helix or β-strand, namely the $N_l$, N-terminal, internal, C-terminal and $C_r$ regions, respectively, as indicated here:

[Figure: an α-helix or β-strand segment from the N-terminus to the C-terminus, divided into the regions N_l, N, internal, C and C_r]

The singleton characteristic

The singleton characteristic is an average of single-residue preferences. Using a database of known protein structures, for every amino acid a the preference of being in a specific segment of type k (e.g., an α-helix or a β-strand) is calculated as

$$S^k(a) = \frac{P^k(a)}{P(a)},$$

where $P(a)$ and $P^k(a)$ are the proportions of amino acids of type a that are contained in the whole database and in segments of type k, respectively (see P.Y. Chou and G.D. Fasman 1978).

Consider a sequence of amino acids $A = a_1 \dots a_L$. Choose start and end positions p and q in the sequence, and a structure type k (e.g., α-helix). The singleton characteristic $S^k(p, q)$ is defined as:

$$S^k(p, q) = \frac{1}{(q - p + 1) + 2m} \left( \sum_{i=p-m}^{p-1} S_i^{N_l} + \sum_{i=p}^{p+m-1} S_i^{N} + \sum_{i=p+m}^{q-m} S_i^{\mathrm{internal}} + \sum_{i=q-m+1}^{q} S_i^{C} + \sum_{i=q+1}^{q+m} S_i^{C_r} \right),$$

where $S_i^k := S^k(a_i)$ denotes the preference of amino acid $a_i$ to be contained in a segment of type k, and the superscripts $N_l$, N, internal, C and $C_r$ indicate that the preference is taken for the corresponding region of such a segment. Here, m is a pre-chosen parameter that determines the size of the non-internal segments $N_l$, N, C and $C_r$. It usually equals 3 or 4.
[2] See the second item in the literature list of this chapter.
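As a rough illustration of the singleton characteristic, the following Python sketch computes single-residue preferences from a toy database and averages region-specific preferences over a segment. It is a minimal sketch under stated assumptions: the database format (pairs of sequence and per-residue state strings), the region keys and all function names are hypothetical, and positions p and q are 1-based and inclusive as in the text.

```python
from collections import Counter

REGIONS = ("Nl", "N", "internal", "C", "Cr")  # hypothetical region keys


def preferences(database, k):
    """S^k(a) = P^k(a) / P(a), estimated from (sequence, states) pairs,
    where states[i] is the structure type of residue sequence[i]."""
    all_counts, k_counts = Counter(), Counter()
    for seq, states in database:
        all_counts.update(seq)
        k_counts.update(a for a, s in zip(seq, states) if s == k)
    n_all, n_k = sum(all_counts.values()), sum(k_counts.values())
    if n_k == 0:
        raise ValueError("no residues of type %r in the database" % k)
    return {a: (k_counts[a] / n_k) / (all_counts[a] / n_all)
            for a in all_counts}


def region_of(i, p, q, m):
    """Region of position i for a segment p..q with flank size m."""
    if i < p:
        return "Nl"
    if i < p + m:
        return "N"
    if i <= q - m:
        return "internal"
    if i <= q:
        return "C"
    return "Cr"


def segment_singleton(seq, p, q, m, prefs):
    """Average of the region-specific preferences prefs[region][amino_acid]
    over positions p-m .. q+m, divided by (q - p + 1) + 2m."""
    if p - m < 1 or q + m > len(seq) or q - p + 1 < 2 * m:
        raise ValueError("segment and flanks must fit into the sequence")
    total = sum(prefs[region_of(i, p, q, m)][seq[i - 1]]
                for i in range(p - m, q + m + 1))
    return total / ((q - p + 1) + 2 * m)


# Toy data: in this made-up database, A occurs only in helices (state "H").
helix_pref = preferences([("AAGG", "HHCC")], "H")
```

In a real setting one preference table per region would be estimated from a large structure database. The toy database above only shows how the ratio of proportions is formed.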
The doublet characteristic

The doublet characteristic is similar to the singleton characteristic. The hope is to obtain a better discrimination by considering pairs of amino acids separated by d = 0, 1, 2 or 3 other residues.

The preference for a particular type of secondary segment k for a pair of amino acids of type a and b, separated by d other residues, is defined as:

$$D^k(a, b, d) = \frac{P^k(a, b, d)}{P(a, b, d)},$$

where $P(a, b, d)$ is the proportion of pairs of amino acids a and b whose positions differ by d in a segment, and $P^k(a, b, d)$ is the same value restricted to those segments of type k, in the given training database.

The average preference of a segment $a_p a_{p+1} \dots a_q$ to be in a particular secondary structure k is denoted by $D^k(p, q, d)$ and is obtained as the normalized sum of all the pair characteristics occurring in the $N_l$, N, internal, C and $C_r$ segments.

The hydrophobic moment

Secondary structure prediction can be aided by examining the periodicity of amino acids with hydrophobic side chains in the protein chain. Tables assigning a hydrophobicity value h(a) (Kyte and Doolittle 1982) to each amino acid a are used to determine the hydrophobicity of different regions of a protein:

[Figure: bar chart of hydrophobicity values (vertical axis from −5 to 5) for the amino acids A R N D C Q E G H I L K M F P S T W Y V]

Here, a positive value means hydrophobic, whereas a negative value means hydrophilic.

Observation: Helices often lie on the surface of a protein and there is a tendency for hydrophobic residues to face the core of the protein and for polar and charged amino acids to face the aqueous environment on the outside of the helix.

The hydrophobic moment is calculated for a segment and different angles of rotation per residue (from 0° to 180°) and measures how well the peptide separates hydrophobic and hydrophilic regions in a pattern that is typical for a helix or strand (Eisenberg et al., 1984).

For a given segment $a_p a_{p+1} \dots a_q$ of sequence, the hydrophobic moment for an angle ω is defined as:

$$M_\omega(p, q) = \left[ \left( \sum_{i=p}^{q} h(a_i) \cos(i\omega) \right)^2 + \left( \sum_{i=p}^{q} h(a_i) \sin(i\omega) \right)^2 \right]^{1/2},$$

where h(a) denotes the hydrophobicity of the amino acid a.

Here, hydrophobicity is treated as a vector, that is, a quantity with both a magnitude (positive or negative!) and a direction. The hydrophobic moment is the length of the sum of these individual hydrophobicity vectors.
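The definition translates directly into code. The sketch below uses the Kyte-Doolittle hydrophobicity scale and takes the angle in degrees. The function name and the 1-based indexing convention are choices made for this illustration. An angle of 100° per residue corresponds to the 3.6 residues per turn of an α-helix (360°/3.6).

```python
import math

# Kyte-Doolittle hydrophobicity values h(a)
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, p, q, omega_deg):
    """M_omega(p, q) for the segment a_p .. a_q (1-based, inclusive):
    the length of the sum of the hydrophobicity vectors, where residue i
    points in direction i * omega."""
    omega = math.radians(omega_deg)
    cos_sum = sum(KD[seq[i - 1]] * math.cos(i * omega)
                  for i in range(p, q + 1))
    sin_sum = sum(KD[seq[i - 1]] * math.sin(i * omega)
                  for i in range(p, q + 1))
    return math.hypot(cos_sum, sin_sum)


# Alternating hydrophobic (L) and hydrophilic (K) residues separate
# perfectly at 180 degrees per residue, but cancel out at 0 degrees.
m_180 = hydrophobic_moment("LKLK", 1, 4, 180)
m_0 = hydrophobic_moment("LKLK", 1, 4, 0)
```

The toy peptide LKLK shows the intended behaviour: the moment is large for the periodicity that places all hydrophobic residues on one side, and small otherwise.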
In the context of predicting α-helices and β-sheets, the angles considered are ω = 100° and ω = 160°, respectively. We use ω(k) to denote the angle associated with the structure type k.

Combining the discriminant functions

The SSP method for secondary structure prediction uses a linear combination of all three described discriminant functions (LDF, linear discriminating function):

$$Z^k(p, q) = \alpha_1^k S^k(p, q) + \alpha_2^k D^k(p, q, d) + \alpha_3^k M_{\omega(k)}(p, q).$$

Given a threshold $c^k$, this function classifies a segment of sequence $a_p a_{p+1} \dots a_q$ into class 1 (i.e., is a structure of type k) if $Z^k(p, q) > c^k$, or class 2 (i.e., is not a structure of type k) if $Z^k(p, q) \le c^k$.

For each type of structure k, the method of linear discriminant analysis is used to determine the coefficients $(\alpha_1^k, \alpha_2^k, \alpha_3^k)$ and the threshold constant $c^k$. For a given training set, the goal is to maximize the ratio of the between-class variation of $Z^k$ to the within-class variation. (We skip the details; see Fisher, 1936.)

The SSP algorithm

Given a protein sequence $A = a_1 a_2 \dots a_L$, the SSP algorithm predicts secondary structures in the following way:

1. Determine a seed α-helix consisting of a segment $a_p a_{p+1} \dots a_q$ of five residues with an average singleton characteristic higher than a given threshold.
2. Compute the value of $Z^k(p, q)$ for k = α-helix.
3. While $Z^k(p, q) > c^k$, extend the segment by one residue, up to a maximal extension of 15 residues in each direction.
4. The extended segment that gives rise to the highest LDF score is considered a potential α-helix.

A similar seed-and-extend strategy is used to determine potential β-strand segments. Here the length of the initial seed is 3.

The result of the two seed-and-extend phases is a set of potential α-helices and β-strands. To obtain a final prediction, overlapping pieces are assigned to the secondary structure types that have the higher LDF value.
Non-overlapping remainders of such pieces with lower LDF values are retained as predictions, if they are still long enough. SSP server:
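The seed-and-extend phase described above can be sketched as follows. This is one possible reading of the procedure, not the SSP implementation: the text does not say in which direction to extend first, so the sketch assumes a greedy choice that tries one residue to the left and one to the right and keeps the better of the two. The scoring function `z` is passed in as a callable, and the toy score used in the example is entirely made up.

```python
def ldf(s, d, m, alphas):
    """Linear discriminant value Z = a1*S + a2*D + a3*M for given
    characteristics s, d, m and coefficients alphas = (a1, a2, a3)."""
    a1, a2, a3 = alphas
    return a1 * s + a2 * d + a3 * m


def seed_and_extend(z, p, q, length, threshold, max_ext=15):
    """Greedily extend the seed p..q (1-based, inclusive) while the score
    z(p, q) stays above the threshold, by at most max_ext residues in each
    direction. Returns (best_score, p, q) or None if the seed fails."""
    p0, q0 = p, q
    best = None
    score = z(p, q)
    while score > threshold:
        if best is None or score > best[0]:
            best = (score, p, q)
        candidates = []
        if p > 1 and p0 - p < max_ext:
            candidates.append((z(p - 1, q), p - 1, q))
        if q < length and q - q0 < max_ext:
            candidates.append((z(p, q + 1), p, q + 1))
        if not candidates:
            break
        score, p, q = max(candidates)
    return best


def toy_score(p, q):
    """Made-up score: +1 per residue inside a hidden helix at 8..20,
    -2 per residue outside of it."""
    inside = sum(1 for i in range(p, q + 1) if 8 <= i <= 20)
    return inside - 2 * ((q - p + 1) - inside)


result = seed_and_extend(toy_score, 12, 16, length=30, threshold=0)
```

With the toy score, the five-residue seed 12..16 grows until it covers exactly the hidden segment 8..20. Further extension lowers the score until it drops below the threshold, and the best segment seen is reported.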
Measuring prediction accuracy

The accuracy of computational methods that need to be trained on a database of solved structures is often assessed using either or both of the following two methods: training and test sets, and cross-validation.

Definition (Leave one out) Assume that we have a training set consisting of n datasets and want to evaluate the performance of some computational method M. In the leave-one-out procedure, for each dataset D repeat the following:

Train the method M on all datasets except D ("leave one out").
Run the method M on D.
Determine whether the method M produced the correct answer on D.

Report the accuracy of the method M as the proportion of correct answers.

To evaluate the performance of a secondary structure prediction, one possibility is to assess the level of single-residue accuracy. However, this may be problematic. For example, a clearly wrong prediction such as αβαβα... in an α-helix region will still give rise to a score of 50% correct residue predictions.

Thus, in practice one also evaluates the number of correctly predicted α-helices and β-strands, considering a structure to be correctly predicted if it contains more than a pre-defined number of correctly predicted residues, which is often small.

Performance of different characteristics of SSP

An experimental evaluation of secondary structure predictions was performed on 126 non-homologous proteins with known three-dimensional structures (Rost and Sander 1993), the secondary structure of which was assigned using the DSSP program. Different combinations of characteristics were compared with each other, giving rise to the following results [3]:
Characteristics used              Q (%)
Singleton characteristic (S)      58.5
S + hydrophobic moment (M)        61.4
Doublet characteristic (D)        62.2
D + M                             64.8
S + D + M

[3] T. Jiang, Y. Xu and M.Q. Zhang (editors), 2002, page 383.

Neural networks

Neural networks are used for classification problems for which there exists a good supply of training data and little understanding of the structure of the problem at hand. They are inspired by the biology of the brain.

A neural network is a graph in which nodes represent neurons and edges represent connections between the neurons. Signals flow through the network and are processed by the neurons. Connections can be weak or strong depending on their weight. These weights are usually set by supervised training.
We distinguish between recurrent architectures that contain directed loops, and feed-forward architectures that do not contain directed loops. An architecture is called layered if elements are grouped in layers and connections between elements are defined through the layers. Input data is presented to an input layer and the output is read from an output layer. Other layers are called hidden:

[Figure: layered network with an input layer, a hidden layer and an output layer]

In bioinformatics, mostly layered feed-forward neural nets are used.

The neuron is the universal basic element of a neural network. One commonly used type is the perceptron:

[Figure: perceptron with inputs x_1, x_2, ..., x_r, weights w_1, w_2, ..., w_r and output y = f(Σ w_i x_i)]

In a feed-forward neural net, a node y is fed from r nodes $x_1, \dots, x_r$ by edges $(x_i, y)$ with weights $w_i$. It processes these inputs and fires a signal of strength f(x), where $x = \sum_{i=1}^{r} w_i x_i$.

Here is a very simple example of a neural net whose task it is to determine whether $x_1 > \frac{1}{2} x_2$:

[Figure: input nodes x_1 and x_2 connected to the output node y by edges of weight 2 and −1]

It takes two numbers $x_1$ and $x_2$ as input and produces a signal $y = 2x_1 + (-1)x_2$ as output that is positive if $2x_1 > x_2$, and negative if $2x_1 < x_2$.

To mimic the firing of a neuron, we would like the output of the node labeled y to be 1 if $2x_1 > x_2$, and 0 if $2x_1 < x_2$. This could be realized using a simple step function

$$y = \begin{cases} 1 & \text{if } 2x_1 > x_2, \\ 0 & \text{else.} \end{cases}$$

However, it is better to use a continuous function for this purpose, such as a step-like sigmoidal function of the form $f(x) = \mathrm{sgm}(x) = \frac{1}{1 + \exp(-x)}$, which looks like this:
[Figure: plot of the sigmoid function sgm(x), which rises from 0 to 1]

Constructing a neural network

There are two steps to constructing a neural network.

The first step is to design the topology of the network. This involves determining the number of input nodes and output nodes and how they are associated with external variables. Additionally, the number of internal or hidden (layers of) nodes must be determined. Finally, nodes have to be connected using edges.

The second step is called training. Supervised training requires a training set consisting of input data points for which the desired output is known. Each such data point is presented to the neural net and then the weights in the net are slightly modified using a gradient descent method so as to increase the performance of the network (as discussed below). The goal is to set the weights of the edges so that the number of correct results produced for a given training data set is maximized.

7.9 The PHD neural network

The PHD (PHD-sec) algorithm by Rost and Sander [4] uses a neural network to predict the secondary structure of a given residue.

The model consists of three layers of processing units: the input layer, the output layer and a hidden layer.

The units of the input layer read the amino acids of a small segment (13-17 residues) of sequence around the position of interest, obtained using a sliding window. There are 21 input units per sequence position, namely one per amino acid and one for padding at the beginning and end of the sequence. Given a single sequence, the input unit corresponding to a given amino acid at a given position is set to 1.

Then signals are sent to units in the hidden layer, which process them and pass them on to the units of the output layer. The final output determines which of the three types of secondary structure is assigned to the central residue.
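The input encoding for a single sequence can be illustrated with a short Python sketch. It is not taken from the PHD software. The ordering of the amino acids, the use of index 20 for the padding unit and the function name are assumptions made for this example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # assumed ordering of the 20 units
PAD = len(AMINO_ACIDS)                 # unit 20 marks positions off the ends
UNITS_PER_POSITION = 21


def encode_window(seq, center, width=13):
    """Return the input vector for the window of the given (odd) width
    centered on the 0-based position `center`: 21 units per window
    position, exactly one of which is set to 1."""
    if width % 2 == 0:
        raise ValueError("window width must be odd")
    half = width // 2
    units = []
    for pos in range(center - half, center + half + 1):
        one_hot = [0] * UNITS_PER_POSITION
        if 0 <= pos < len(seq):
            one_hot[AMINO_ACIDS.index(seq[pos])] = 1
        else:
            one_hot[PAD] = 1   # window reaches past the start or end
        units.extend(one_hot)
    return units


# Window around the first residue of the example segment LSWTKCYAVSGAP:
# the six positions to the left of the sequence are padding.
x = encode_window("LSWTKCYAVSGAP", center=0, width=13)
```

A window of 13 positions yields 13 × 21 = 273 input units. Sliding the window along the sequence produces one such vector per residue.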
The PHD paper describes three successive neural networks: PHD-sec for secondary structure prediction, PHD-htm for predicting transmembrane helices and PHD-acc for solvent accessibility.

[4] B. Rost and C. Sander, Prediction of protein secondary structure at better than 70% accuracy. J Mol Biol 232, 1993.
Here is a simplified depiction of PHD-sec:

[Figure: simplified depiction of PHD-sec. A window of the input sequence (L S W T K C Y A V S G A P) is read by the input layer, passed to a hidden layer with units H_j and then to an output layer with units O_k, which give the predicted structure (α, β or coil). (Rost, 1996) (Adapted from Mount, 2001)]

If the input to the neural net consists of a sequence profile, then each input unit is set to the frequency of the associated amino acid at the given position. Additionally, two input units are used to count insertions and deletions.

The predictions obtained for adjacent windows are then post-processed by applying rules or additional neural nets to obtain a final prediction.

Experimental studies show that the PHD method applied to sequences obtains a single-residue accuracy of 70.8%. Application to sequence profiles gives rise to an accuracy of 72% (Rost and Sander 1994).

The PHD algorithm uses sequences from the HSSP (homology-derived secondary structure of proteins) database for training (Sander and Schneider, 1991).

7.10 Training the PHD neural network

A method called back-propagation can be used to train such neural networks. For example, consider the output node $O_k$ shown in the network above and assume that it predicts whether the central residue lies in an α-helix. The output signal $O_k$ predicts an α-helix if it is close to 1, and no α-helix if it is close to 0.
Presented with a training data point, we know whether or not the central residue actually lies in an α-helix, and thus what the desired output $D_k$ of $O_k$ should be.

Consider one of the hidden units $H_j$ that is connected to $O_k$ and emits a signal $H_j$ that is modified by the weight $w_{jk}$. The signal arriving at $O_k$ is $w_{jk} H_j$:

[Figure: hidden unit H_j emits the signal H_j, which arrives at the output unit O_k as w_jk H_j]

When training the network, the main question is: how should we alter $w_{jk}$ so as to bring the value $O_k = \mathrm{sgm}(\sum_j w_{jk} H_j)$ of node $O_k$ closer to the desired value $D_k$?

Assume that the network has p input, q hidden and r output nodes. In this case, the output of a hidden node $H_j$ is given by:

$$H_j = \mathrm{sgm}\left( \sum_{i=1}^{p} w_{I_i H_j} I_i \right).$$

The output of an output node $O_k$ is given by:

$$O_k = \mathrm{sgm}\left( \sum_{j=1}^{q} w_{H_j O_k} H_j \right).$$

Hence,

$$O_k = \mathrm{sgm}\left( \sum_{j=1}^{q} w_{H_j O_k} \, \mathrm{sgm}\left( \sum_{i=1}^{p} w_{I_i H_j} I_i \right) \right), \quad \text{for } k = 1, 2, \dots, r.$$

This allows us to calculate the output for a given input set. A training set specifies pairs of inputs and desired outputs,

$$(I_1^1, I_2^1, \dots, I_p^1, D_1^1, \dots, D_r^1), \dots, (I_1^t, I_2^t, \dots, I_p^t, D_1^t, \dots, D_r^t).$$

The squared error, summed over all t training examples and r output nodes, is defined as (here the superscript q indexes the training examples):

$$E = \sum_{q=1}^{t} \sum_{i=1}^{r} (D_i^q - O_i^q)^2,$$

which is straightforward to calculate using the previous equation.

The gradient descent method specifies that we repeatedly do the following: Choose some weight $w_{ij}$ in the network and modify it by a small amount $\Delta w_{ij} = -n \, \partial E / \partial w_{ij}$, so as to decrease the error E. The factor n is the training rate (.3).

For example, in the case of an edge jk attaching a hidden node $H_j$ to an output node $O_k$, the partial derivative of the error E with respect to $w_{jk}$ for a single training example is given, up to a constant factor that can be absorbed into the training rate n, by

$$\partial E / \partial w_{jk} = (O_k - D_k) \, O_k (1 - O_k) \, H_j,$$

which we will not derive here. So in this case, the weight $w_{jk}$ is modified by this amount:

$$\Delta w_{jk} = -n (O_k - D_k) \, O_k (1 - O_k) \, H_j.$$

Secondary structure prediction web server:
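The forward pass and the update rule for the hidden-to-output weights can be sketched in plain Python. This is a minimal illustration of the formulas above, not the PHD training code. It only updates the weights $w_{jk}$ between hidden and output nodes, the weight layout (nested lists) and function names are hypothetical, and the training rate n = 0.3 is an assumed reading of the value printed in the text.

```python
import math


def sgm(x):
    """Sigmoid function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, w_ih, w_ho):
    """Compute hidden signals H_j and outputs O_k.
    w_ih[i][j] is the weight from input i to hidden node j,
    w_ho[j][k] is the weight from hidden node j to output node k."""
    hidden = [sgm(sum(w_ih[i][j] * inputs[i] for i in range(len(inputs))))
              for j in range(len(w_ho))]
    outputs = [sgm(sum(w_ho[j][k] * hidden[j] for j in range(len(hidden))))
               for k in range(len(w_ho[0]))]
    return hidden, outputs


def update_output_weights(inputs, desired, w_ih, w_ho, n=0.3):
    """One gradient descent step on the hidden-to-output weights:
    w_jk <- w_jk - n * (O_k - D_k) * O_k * (1 - O_k) * H_j.
    The training rate n = 0.3 is an assumption. Modifies w_ho in place."""
    hidden, outputs = forward(inputs, w_ih, w_ho)
    for j, h in enumerate(hidden):
        for k, (o, d) in enumerate(zip(outputs, desired)):
            w_ho[j][k] -= n * (o - d) * o * (1.0 - o) * h
    return w_ho


# Made-up network with p = 2 inputs, q = 2 hidden nodes and r = 1 output.
w_ih = [[0.5, -0.5], [0.3, 0.8]]
w_ho = [[0.1], [-0.2]]
x, d = [1.0, 0.0], [1.0]

error_before = (d[0] - forward(x, w_ih, w_ho)[1][0]) ** 2
for _ in range(50):
    update_output_weights(x, d, w_ih, w_ho)
error_after = (d[0] - forward(x, w_ih, w_ho)[1][0]) ** 2
```

Repeating the update on this single made-up training example moves the output toward the desired value, so the squared error shrinks. Training the input-to-hidden weights requires propagating the error one layer further back, which is not shown here.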
Protein Structure Marianne Øksnes Dalheim, PhD candidate Biopolymers, TBT4135, Autumn 2013 The presentation is based on the presentation by Professor Alexander Dikiy, which is given in the course compedium:
More information114 Grundlagen der Bioinformatik, SS 09, D. Huson, July 6, 2009
114 Grundlagen der Bioinformatik, SS 09, D. Huson, July 6, 2009 9 Protein tertiary structure Sources for this chapter, which are all recommended reading: D.W. Mount. Bioinformatics: Sequences and Genome
More informationThe Structure of Enzymes!
The Structure of Enzymes Levels of Protein Structure 0 order amino acid composition Primary Secondary Motifs Tertiary Domains Quaternary ther sequence repeating structural patterns defined by torsion angles
More informationThe Structure of Enzymes!
The Structure of Enzymes Levels of Protein Structure 0 order amino acid composition Primary Secondary Motifs Tertiary Domains Quaternary ther sequence repeating structural patterns defined by torsion angles