Predictors (of secondary structure) based on Machine Learning tools

1 Predictors (of secondary structure) based on Machine Learning tools

2 Predictors of secondary structure
First-generation methods: propensity of each residue to be in a given conformation (Chou-Fasman)
Second-generation methods: context dependence is introduced (GOR). Neural networks can be adopted.

3 Tools from machine learning approaches
Neural networks can learn the mapping from sequence to secondary structure.
Training: general rules are extracted from a subset of the database with known mapping.
Prediction: the general rules are applied to a new sequence.
Sequence: TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN
Structure: EEEE..HHHHHHHHHHHH...HHHHHHHH.EEEE

4 Neural network for secondary structure prediction
Outputs encode the structure of the central residue of the input window.
Output: α, β, C
Hidden neurons: 4-15
Input: M P I L K Q K P I H Y H P N H G E A K G
Typical input window: residues
How would you encode the input? (Numerical values are needed.)

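5 Neural network for secondary structure prediction
Output: α, β, C
Input: M P I L K Q K P I H Y H P N H G E A K G
20-valued vectors for each position prevent the introduction of spurious similarities among residues.
Figure: binary encoding matrix with one row per residue type (A C D E F G H I K L M N P Q R S T V W Y) and one column per window position; an entry is 1 where that residue occurs and 0 elsewhere.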

6 Input alphabet: A C D E F G H I K L M N P Q R S T V W Y
Example residues with their observed structure: D (L) R (E) Q (E) G (E) F (E) V (E) P (E) A (H) A (H) Y (H) V (E) K (E) K (E)
Output classes: H, E, L
The actual number of input neurons is 20 x window length.
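The encoding described above can be sketched in a few lines of Python. This is only an illustration: the function name `one_hot_window`, the window of 17 residues and the all-zero padding at the chain ends are assumptions, not details given in the slides.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20-letter alphabet listed on the slide


def one_hot_window(sequence, center, window=17):
    """Encode the window centred on `center` as a flat list of 20 * window values.

    Positions that fall outside the sequence are encoded as all zeros
    (an assumption; the slides do not say how chain ends are handled).
    """
    half = window // 2
    encoded = []
    for pos in range(center - half, center + half + 1):
        vector = [0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            vector[AMINO_ACIDS.index(sequence[pos])] = 1
        encoded.extend(vector)
    return encoded


x = one_hot_window("MPILKQKPIHYHPNHGEAKG", center=10, window=17)
print(len(x), sum(x))  # 340 17
```

Each position contributes exactly one non-zero entry, so no two residue types are ever "closer" to each other than any other pair.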

7 Training and testing of a predictor on 822 proteins from the PDB
The cross-validation procedure: the protein set is split into training set 1 and testing set 1.

8 Efficiency of the neural network-based predictors on 822 proteins (in testing phase)
Input: single sequence
Accuracy (%): 66.3
        H      E      C
Sens    0.69   0.61   0.66
PPV     0.70   0.54   0.71
MCC     0.54   0.44   0.45
On the same set GOR performs at 64% accuracy.
Why do NNs (slightly) outperform GOR, although both use the same input information?
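The slides do not define the indices in the table. A minimal sketch, assuming the standard per-class definitions of sensitivity, positive predictive value and Matthews correlation coefficient, and using hypothetical function names and a made-up ten-residue example:

```python
import math


def accuracy(observed, predicted):
    """Fraction of residues whose predicted class equals the observed one."""
    return sum(o == p for o, p in zip(observed, predicted)) / len(observed)


def per_class_scores(observed, predicted, label):
    """Return (sensitivity, PPV, MCC) for one structural class."""
    pairs = list(zip(observed, predicted))
    tp = sum(o == label and p == label for o, p in pairs)
    fp = sum(o != label and p == label for o, p in pairs)
    fn = sum(o == label and p != label for o, p in pairs)
    tn = sum(o != label and p != label for o, p in pairs)
    sens = tp / (tp + fn) if tp + fn else 0.0
    ppv = tp / (tp + fp) if tp + fp else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sens, ppv, mcc


observed = "HHHHEEEECC"   # made-up example, not data from the slides
predicted = "HHHEEEEECC"
print(accuracy(observed, predicted))
print(per_class_scores(observed, predicted, "H"))
```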

9 GOR
Simplification (1): only local sequences (window size = 17) are considered:
$I(S_j; R) \approx I(S_j; R_{j-8}, \ldots, R_j, \ldots, R_{j+8})$
Simplification (2): each residue position is statistically independent:
$I(S_j; R_{j-8}, \ldots, R_j, \ldots, R_{j+8}) \approx \sum_{m=-8}^{8} I(S_j; R_{j+m})$
NNs
The hidden layer performs a non-linear mapping of the input into a new representation. Non-linearity can better model correlations among the positions in the input window.

10 Neural network for secondary structure prediction
Outputs encode the structure of the central residue of the input window.
Output: α, β, C
Hidden neurons: 4-15
Input: M P I L K Q K P I H Y H P N H G E A K G
Typical input window: residues
Can we add more information in the input encoding?

11 Third-generation methods: evolutionary information
Multiple sequence alignment (10 sequences, as extracted):
1   Y K D Y H S - D K K K G E L
2   Y R D Y Q T - D Q K K G D L
3   Y R D Y Q S - D H K K G E L
4   Y R D Y V S - D H K K G E L
5   Y R D Y Q F - D Q K K G S L
6   Y K D Y N T - H Q K K N E S
7   Y R D Y Q T - D H K K A D L
8   G Y G F G - - L I K N T E T T K
9   T K G Y G F G L I K N T E T T K
10  T K G Y G F G L I K N T E T T K
Figure: sequence profile derived from the alignment (rows: residue types A C D E F G H K I L M N P Q R S T V W Y; columns: alignment positions; entries: residue frequencies).
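How a profile is derived from an alignment can be sketched as follows. The function name `sequence_profile` is hypothetical, only the first three aligned rows are used as a toy input, and normalising over the non-gap residues of each column is an assumption rather than something the slide states.

```python
from collections import Counter


def sequence_profile(alignment, gap="-"):
    """Return one dict per column mapping residue -> frequency among non-gap residues."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for column in zip(*alignment):
        counts = Counter(res for res in column if res != gap)
        total = sum(counts.values())
        # A column made only of gaps yields an empty dict.
        profile.append({res: n / total for res, n in counts.items()} if total else {})
    return profile


msa = ["YKDYHS-DKKKGEL", "YRDYQT-DQKKGDL", "YRDYQS-DHKKGEL"]
profile = sequence_profile(msa)
print(profile[0])  # fully conserved column
print(profile[1])  # K in one sequence, R in two
```

Each column of the profile is then a 20-valued vector that can replace the binary encoding of a single residue.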

12 One problem
Output: α, β, C
Each prediction considers a neighbourhood of the sequence. However, the prediction on each window is independent of the predictions in neighbouring windows, so predictions are uncorrelated and sometimes inconsistent.
Figure: the window slides along the input M P I L K Q K P I H Y H P N H G E A K G, giving the successive outputs C, β, α, α, β.
How can we add correlations among neighbouring predictions?

13 The network architecture for secondary structure prediction
The first network (sequence to structure)
Output: H, E, C
Input columns: SeqNo, No, V L I M F W Y G A P S T C H R K Q E N D

14 The network architecture for secondary structure prediction
The second network (structure to structure)
Output: H, E, C
Input: output of the first network (CCHHEHHHHCHHCCEECCEEEEHHHCC) together with the columns SeqNo, No, V L I M F W Y G A P S T C H R K Q E N D

15 Efficiency of the neural network-based predictors on 822 proteins (in testing phase)
Input: single sequence
Accuracy (%): 66.3
        H      E      C
Sens    0.69   0.61   0.66
PPV     0.70   0.54   0.71
MCC     0.54   0.44   0.45
Input: multiple sequence (PSI-BLAST)
Accuracy (%): 73.4
        H      E      C
Sens    0.75   0.70   0.73
PPV     0.80   0.63   0.75
MCC     0.67   0.56   0.53

16 Secondary structure prediction
From sequence: TTCCPSIVARSNFNVCRLPGTPEAICATYTGCIIIPGATCPGDYAN
To secondary structure: EEEE..HHHHHHHHHHHH...HHHHHHHH.EEEE...
And to the reliability of the prediction, usually computed as INT(10 x difference of the two highest output values).
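A minimal sketch of the reliability index, assuming network outputs in the range 0 to 1 and a scaling factor of 10 (so that the index runs from 0 upwards in integer steps). The function name and the example outputs are hypothetical.

```python
def reliability_index(outputs):
    """INT(10 * (highest output - second highest output)) for one residue."""
    top, second = sorted(outputs, reverse=True)[:2]
    return int(10 * (top - second))


# Hypothetical outputs for the three classes (alpha, beta, coil).
print(reliability_index([0.80, 0.15, 0.05]))  # clear-cut prediction, high index
print(reliability_index([0.40, 0.35, 0.25]))  # ambiguous prediction, low index
```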

17 Chameleon sequences
QEALEIA
GIKSKQEALEIAARRN: Translation Initiation Factor 3, Bacillus stearothermophilus (PDB: 1TIF)
FNPQTQEALEIAPSVGV: Transcription Factor 1, Bacteriophage SPO1 (PDB: 1WTU, chain A)

18 We extract, from a set of 822 non-homologous proteins (174,192 residues):
2,452 5-mer chameleons
107 6-mer chameleons
16 7-mer chameleons
1 8-mer chameleon
for a total of 2,576 couples.
The total number of residues in chameleons is 26,044 out of 755 protein chains (~15%).

19 Prediction of the secondary structure of chameleon sequences with neural networks
The segment QEALEIA occurs with structure HHHHHHH and with structure CCCCCCC.
Sequence contexts: NGDQLGIKSKQEALEIAARRNLDLVLVAP and ARKGFNPQTQEALEIAPSVGVSVKPG
Network outputs: α, β, C

20 The prediction of chameleons with neural networks
Method               Performance on the protein data set   Performance on chameleon sequences
NN with MSA input    73.4%                                 75.1%
NN with SS input     66.3%                                 58.9%
GOR IV               64.4%                                 55.2%
Jacoboni I, Martelli PL, Fariselli P, Compiani M, Casadio R. Predictions of protein segments with the same aminoacid sequence and different secondary structure: a benchmark for predictive methods. Proteins. 2

21 Other neural network-based predictors
Secondary structure
Topology of transmembrane proteins
Cysteine bonding state
Contact maps of proteins
Interaction sites on protein surface

22 Prediction of membrane protein topology

23 SEVERAL TYPES OF INTERACTION BETWEEN PROTEINS AND MEMBRANES
Anchors; integral membrane proteins; peripheral membrane proteins
Geoffrey M. Cooper, The Cell

24 FUNCTIONS OF MEMBRANE PROTEINS
Transport; intercellular joining; enzymatic activity; intercellular recognition; signal transduction; attachment to cytoskeleton and extracellular matrix

25 INTEGRAL MEMBRANE PROTEINS
β-barrel: porin (Rhodobacter capsulatus)
α-helices: bacteriorhodopsin (Halobacterium salinarum)

26 TOPOLOGY OF INTEGRAL MEMBRANE PROTEINS
Topography: position of transmembrane segments along the sequence
Topology: position of N and C termini with respect to the bilayer
Figure: a protein crossing the bilayer, with the outer side (Out), the inner side (In), the N and C termini and the positive charges (+) marked.

27 Searching for the most suitable features

28 Starting points
1) Different portions of the protein have different amino-acid composition
Figure: frequency (%) of each residue (A I L V G M P F W Y N C Q S T H R K E D, grouped into apolar, polar and charged amino acids) in loops, transmembrane alpha-helices and transmembrane beta strands.

29 Starting points
1) Different portions of the protein have different amino-acid composition
von Heijne's positive-inside rule

30 Starting points
1) Different portions of the protein have different amino-acid composition
2) Transmembrane segments and loops are organised following a rigid grammar
Figure: outer side, transmembrane, inner side.

31 Starting points
1) Different portions of the protein have different amino-acid composition
2) Transmembrane segments and loops are organised following a rigid grammar
Figure: frequency (%) versus length of transmembrane alpha-helices (residues).

32 Starting points
1) Different portions of the protein have different amino-acid composition
2) Transmembrane segments and loops are organised following a rigid grammar
Figure: frequency (%) versus length of transmembrane beta strands (residues).

33 Starting points
1) Different portions of the protein have different amino-acid composition
2) Transmembrane segments and loops are organised following a rigid grammar
3) Evolutionary information can increase the prediction performance

34 Differences in composition: propensity scales

35 First-generation methods: single-residue statistics
Propensity scales: for each residue, the association between the residue and the different features is statistically evaluated.
Physical and chemical features of residues.
A propensity value for any structure can be associated with any residue. How?

36 Transmembrane alpha-helices: Kyte-Doolittle scale
It is computed taking into consideration the octanol-water partition coefficient, combined with the propensity of the residues to be found in known transmembrane helices.
Ala: 1.8 Arg: -4.5 Asn: -3.5 Asp: -3.5 Cys: 2.5 Gln: -3.5 Glu: -3.5 Gly: -0.4 His: -3.2 Ile: 4.5 Leu: 3.8 Lys: -3.9 Met: 1.9 Phe: 2.8 Pro: -1.6 Ser: -0.8 Thr: -0.7 Trp: -0.9 Tyr: -1.3 Val: 4.2

37 More than 20 different hydrophobicity scales in ProtScale, and many others

38 The Kyte-Doolittle scale
"The scale is based on an amalgam of experimental observations derived from the literature... In the case of membrane-bound proteins, the portions of their sequences that are located within the lipid bilayer are also clearly delineated by large uninterrupted areas on the hydrophobic side of the midpoint line. As such, the membrane-spanning segments of these proteins can be identified by this procedure."
Kyte J, Doolittle RF, J Mol Biol 157:105-132, 1982
Ala: 1.8 Arg: -4.5 Asn: -3.5 Asp: -3.5 Cys: 2.5 Gln: -3.5 Glu: -3.5 Gly: -0.4 His: -3.2 Ile: 4.5 Leu: 3.8 Lys: -3.9 Met: 1.9 Phe: 2.8 Pro: -1.6 Ser: -0.8 Thr: -0.7 Trp: -0.9 Tyr: -1.3 Val: 4.2
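The hydropathy plot referred to in the quotation is a sliding-window average of the scale. A minimal sketch, in which the function name, the default window of 19 residues and the truncation of the window at the chain ends are assumptions:

```python
# Kyte-Doolittle values, one-letter codes.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Average KD value in a window centred on each residue.

    Near the chain ends the window is truncated rather than padded
    (an assumption made for this sketch).
    """
    half = window // 2
    profile = []
    for i in range(len(sequence)):
        segment = sequence[max(0, i - half): i + half + 1]
        profile.append(sum(KD[res] for res in segment) / len(segment))
    return profile


plot = hydropathy_profile("MPILKQKPIHYHPNHGEAKG", window=7)
print([round(value, 2) for value in plot])
```

Long stretches where the averaged value stays on the hydrophobic side are the candidate membrane-spanning segments.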

39 SecY Translocon, Methanococcus jannaschii, PDB: 1RHZ:A
Plot: KD average along the sequence.

40 SecY Translocon, Methanococcus jannaschii, PDB: 1RHZ:A
Plot: KD average along the sequence, with the observed TM helices.

41 A hydropathy scale based on thermodynamic principles
The Wimley-White scale
White SH, Wimley WC, Annu Rev Biophys Biomol Struct 28: (1999)

42 A hydropathy scale based on thermodynamic principles
The Wimley-White scales

43 SecY Translocon, Methanococcus jannaschii PDB: 1RHZ:A

44 The Wimley-White scale correlates with free energies of insertion as measured in vivo
Host segment: AAAAAAAAAXAAAAAAAAA
Hessa T et al., Nature 433: (2005)

45 Length constraints: filtering procedures

46 Algorithms specifically designed to optimise the number and length of segments in a given protein sequence, so that they are compatible with the membrane-spanning regions.
MaxSubSeq, given a general propensity plot, finds the maximum-scoring subsequences with:
constrained segment length: 15-40 residues for all-α membrane proteins, 6-25 residues for β-barrel membrane proteins
constrained segment number: even number of β-strands in β-barrel membrane proteins
Fariselli et al. (2003) Bioinformatics 19:500-505
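The idea of length-constrained segment optimisation can be illustrated with a small dynamic programme. This is a simplified sketch and not the published MaxSubSeq algorithm: it assumes per-residue scores that are already centred (positive inside segments, negative outside), it adds a hypothetical `min_gap` parameter, and it does not enforce any constraint on the number of segments.

```python
def best_segments(scores, min_len, max_len, min_gap=1):
    """Pick non-overlapping segments of bounded length with maximum total score."""
    n = len(scores)
    prefix = [0.0]
    for s in scores:
        prefix.append(prefix[-1] + s)
    # best[i]: best total over positions [0, i); choice[i]: start of a segment ending at i.
    best = [0.0] * (n + 1)
    choice = [None] * (n + 1)
    for i in range(1, n + 1):
        best[i] = best[i - 1]
        for length in range(min_len, min(max_len, i) + 1):
            start = i - length
            before = best[max(0, start - min_gap)]
            total = before + prefix[i] - prefix[start]
            if total > best[i]:
                best[i] = total
                choice[i] = start
    # Trace back the chosen segments as half-open (start, end) intervals.
    segments = []
    i = n
    while i > 0:
        if choice[i] is None:
            i -= 1
        else:
            start = choice[i]
            segments.append((start, i))
            i = max(0, start - min_gap)
    return best[n], segments[::-1]


scores = [-1, 2, 2, 2, -1, -1, 3, 3, -1]  # made-up centred propensities
print(best_segments(scores, min_len=2, max_len=3))
```

With real data the length bounds would be those quoted above for α-helical or β-barrel proteins.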

47 SecY Translocon Methanococcus jannaschii PDB: 1RHZ:A

48 SecY Translocon Methanococcus jannaschii PDB: 1RHZ:A

49 MEMSAT 1

50 A more sophisticated model of membrane proteins
Five structural states are defined:
Inner loop (L_i)
Helix inner end (H_i)
Helix middle (H_m)
Helix outer end (H_o)
Outer loop (L_o)
Jones DT et al, Biochemistry 33: (1994)

51 Propensity scales are computed for each state
Starting from the proteins whose topology is known (either from 3-D coordinates or low-resolution experiments), the propensity parameters
$\mathrm{Prop}(aa, state) = -\log \frac{p(aa, state)}{p(aa)\, p(state)}$
are computed.
Jones DT et al, Biochemistry 33: (1994)
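A minimal sketch of how such propensities could be estimated from labelled residues, following the formula as written on the slide (including its minus sign). The function name, the state labels and the four-residue example are hypothetical; a real estimate would pool many proteins and handle unseen residue/state pairs.

```python
import math
from collections import Counter


def propensities(sequence, states):
    """Prop(aa, state) = -log[ p(aa, state) / (p(aa) * p(state)) ] from labelled residues."""
    n = len(sequence)
    pair_counts = Counter(zip(sequence, states))
    aa_counts = Counter(sequence)
    state_counts = Counter(states)
    return {
        (aa, st): -math.log((c / n) / ((aa_counts[aa] / n) * (state_counts[st] / n)))
        for (aa, st), c in pair_counts.items()
    }


# Toy example: "M" = helix middle, "O" = outer loop (labels chosen for this sketch).
props = propensities("LLLK", "MMMO")
print(props[("L", "M")])  # negative: L is over-represented in state M
```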

52 Jones DT et al, Biochemistry 33: (1994)

53 Strategies for finding better propensities: 1) Considering the sequence context

54 Neural network for secondary structure prediction
Output: TM, not TM
Non-linear mapping
Usually: input residues
Hidden neurons: 4-15
Input: M P I L K Q K P I H Y H P N H G E A K G
Figure: binary encoding matrix with one row per residue type (A C D E F G H I K L M N P Q R S T V W Y) and one column per window position.

55 Strategies for finding better propensities: 2) Introducing evolutionary information

56 Evolutionary information and sequence profiles
MSA (10 sequences, as extracted):
1   Y K D Y H S - D K K K G E L
2   Y R D Y Q T - D Q K K G D L
3   Y R D Y Q S - D H K K G E L
4   Y R D Y V S - D H K K G E L
5   Y R D Y Q F - D Q K K G S L
6   Y K D Y N T - H Q K K N E S
7   Y R D Y Q T - D H K K A D L
8   G Y G F G - - L I K N T E T T K
9   T K G Y G F G L I K N T E T T K
10  T K G Y G F G L I K N T E T T K
Figure: sequence profile (rows: residue types A C D E F G H K I L M N P Q R S T V W Y; columns: sequence positions; entries: residue frequencies).

57 Example: propensity scales weighted on sequence profiles
Sequence of characters $s_t$ (A C L P R P ...): the propensity at position t is $KD(s_t)$.
Sequence of 20-dimensional vectors $v_t$: the propensity at position t is $\sum_{i=1}^{20} v_t(i)\, KD(i)$.
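The profile-weighted propensity is a dot product between a profile column and the scale. A minimal sketch, with a hypothetical function name and made-up profile columns:

```python
# Kyte-Doolittle values, one-letter codes.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def profile_weighted_kd(profile_column):
    """Sum over residues of v_t(i) * KD(i) for one profile column (residue -> frequency)."""
    return sum(freq * KD[res] for res, freq in profile_column.items())


print(profile_weighted_kd({"I": 1.0}))            # conserved position: plain KD(I)
print(profile_weighted_kd({"L": 0.5, "K": 0.5}))  # variable position: weighted average
```

A fully conserved column gives back the single-sequence value, so the single-sequence plot is a special case of the profile-weighted one.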

58 SecY Translocon, Methanococcus jannaschii, PDB: 1RHZ:A
KD + MaxSubSeq: 72% correct topography
PSI-KD + MaxSubSeq: 83% correct topography

59 Propensities can be computed with a NN or a SVM

60 Artificial neural networks
Single-layer perceptron: inputs $x_1, \ldots, x_d$ plus a bias input $x_0$; outputs $y_1, \ldots, y_m$
$a_j = \sum_{i=0}^{d} w_{ji} x_i$
$y_j = g(a_j)$
The error function: $Y_i(X_q)$ = output of the network, $D_{iq}$ = expected value
The back-propagation training algorithm (gradient descent: Rumelhart et al. 1986)
Correction to the weights: $\mu$ = learning rate, $\eta$ = momentum term
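The slide names the error function and the weight correction, but their formulas did not survive in the transcript. For orientation, the textbook forms are given below; it is an assumption that the slides use exactly these (a sum-of-squares error over training patterns q and outputs i, and gradient descent with a momentum term).

```latex
E = \frac{1}{2} \sum_{q} \sum_{i} \left( Y_i(X_q) - D_{iq} \right)^2

\Delta w_{ji}(t) = -\mu \, \frac{\partial E}{\partial w_{ji}} + \eta \, \Delta w_{ji}(t-1)
```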

61 Neural network for the prediction of TMS in membrane proteins
Output: TM, non-TM
Input: A V L I A W M L C ... (local information)

62 Dynamic programming filtering procedure
Figure: TM activation along the sequence and the predicted TM segments.
Maximum-scoring subsequences with constrained segment length and number

65 Modelling the grammar: Hidden Markov Models

66 A generic model for membrane proteins End Outer Side Transmembrane Inner Side Begin

67 Model of α-helix membrane proteins (HMM1)
Outer side, transmembrane, inner side

68 Model of α-helix membrane proteins (HMM1)
Outer side, transmembrane, inner side

69 Hidden Markov Models
Generation of the sequence: a path through the states. Each state emits a residue of the sequence with its own probability distribution.
Given a sequence and a trained model we can compute:
the probability of the sequence
the decoding of the sequence (which state emits a given residue?)

70 Decoding
Viterbi decoding: given a sequence, the Viterbi algorithm finds the optimal path through the states.
A posteriori decoding: given a sequence, for each position i and for each state k, we compute P(state(i) = k | sequence); then we sum over all the k corresponding to transmembrane states.
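Both decoding schemes can be sketched on a toy model. Everything in the example is hypothetical: a two-state HMM (T = transmembrane, L = loop) over a two-letter alphabet (h = hydrophobic, p = polar) with made-up probabilities, and plain probabilities rather than logarithms, which is only adequate for short sequences.

```python
def viterbi(obs, states, start, trans, emit):
    """Most probable state path for the observed symbols."""
    score = [{s: start[s] * emit[s][obs[0]] for s in states}]
    back = []
    for symbol in obs[1:]:
        prev = score[-1]
        column, pointers = {}, {}
        for s in states:
            best = max(states, key=lambda r: prev[r] * trans[r][s])
            column[s] = prev[best] * trans[best][s] * emit[s][symbol]
            pointers[s] = best
        score.append(column)
        back.append(pointers)
    path = [max(states, key=lambda s: score[-1][s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


def posterior(obs, states, start, trans, emit):
    """P(state(i) = k | sequence) for every position i, by forward-backward."""
    fwd = [{s: start[s] * emit[s][obs[0]] for s in states}]
    for symbol in obs[1:]:
        fwd.append({s: emit[s][symbol] * sum(fwd[-1][r] * trans[r][s] for r in states)
                    for s in states})
    bwd = [{s: 1.0 for s in states}]
    for symbol in reversed(obs[1:]):
        bwd.insert(0, {r: sum(trans[r][s] * emit[s][symbol] * bwd[0][s] for s in states)
                       for r in states})
    total = sum(fwd[-1].values())
    return [{s: f[s] * b[s] / total for s in states} for f, b in zip(fwd, bwd)]


states = ["T", "L"]
start = {"T": 0.5, "L": 0.5}
trans = {"T": {"T": 0.9, "L": 0.1}, "L": {"T": 0.1, "L": 0.9}}
emit = {"T": {"h": 0.8, "p": 0.2}, "L": {"h": 0.2, "p": 0.8}}
print("".join(viterbi("hhhppp", states, start, trans, emit)))
print([round(p["T"], 2) for p in posterior("hhhppp", states, start, trans, emit)])
```

In a model with several transmembrane states, the per-residue TM probability would be the sum of the posteriors of all those states, as the slide describes.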

71 Dynamic programming filtering procedure
Figure: TM probability along the sequence and the predicted TM segments.
Maximum-scoring subsequences with constrained segment length and number

73 Sequence profile based HMMs

74 Sequence-profile-based HMM
Sequence of characters $s_t$: A C L P R P E T ...
Sequence of M-dimensional vectors $v_t$ (M = 20)
Constraints:
$0 \le v_t(n) \le 1 \quad \forall t, n$
$\sum_{k=1}^{M} v_t(k) = 1 \quad \forall t$

75 Sequence-profile-based HMM
Martelli PL, Fariselli P, Krogh A, Casadio R. A sequence-profile-based HMM for predicting and discriminating beta barrel membrane proteins. Bioinformatics. 2002;18 Suppl 1:S
Probability of emission from state k
Sequence of characters $s_t$: $P(s_t \mid k) = e_k(s_t)$
Sequence of M-dimensional vectors $v_t$: $P(v_t \mid k) = \frac{1}{Z} \sum_{n=1}^{M} v_t(n)\, e_k(n)$
Constraint: $\int P(v_t \mid k)\, d^M v_t = 1$
The normalisation constant Z depends on the state only through $\sum_{n=1}^{M} e_k(n)$. If $\sum_{n=1}^{M} e_k(n) = 1$, Z is independent of the state.
Algorithms for training and probability computation can be derived.
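A minimal sketch of the profile emission, leaving out the constant 1/Z (which is the same for every state when the emission parameters of each state sum to 1). The function name, the three-letter alphabet and the emission values are made up for the example.

```python
def profile_emission(profile_vector, emission):
    """Unnormalised P(v_t | k): sum over residues of v_t(n) * e_k(n)."""
    return sum(freq * emission.get(res, 0.0) for res, freq in profile_vector.items())


# Hypothetical emission parameters e_k(n) of one state over a reduced alphabet.
e_k = {"L": 0.6, "A": 0.3, "K": 0.1}

print(profile_emission({"L": 1.0}, e_k))            # one-hot column: reduces to e_k(L)
print(profile_emission({"L": 0.5, "K": 0.5}, e_k))  # mixed column: weighted average
```

A column with a single residue reproduces the ordinary character emission, so a standard HMM is recovered when the profile is built from one sequence.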

76 ENSEMBLE system for predicting the topology of all alpha membrane proteins

77 A new Bologna predictor trained on high-resolution TM α-helices of membrane proteins
From the PDB: 59 transmembrane chains, 229 transmembrane α-helices
Bacterial rhodopsin, G-protein coupled receptors, light harvesting complexes, photosystems, potassium channels, mechanosensitive channel, chloride channel, other channels, glycophorin, ATP binding cassette transporters, Ca ATPase, fumarate reductase, Fo ATP synthase, cytochrome c oxidase, cytochrome bc1 complexes, formate dehydrogenase
Martelli PL, Fariselli P, Casadio R. An ENSEMBLE machine learning approach for the prediction of all-alpha membrane proteins. Bioinformatics. 2003;19 Suppl 1:i205-11.

78 Model of α-helix membrane proteins (HMM1)
Outer side, transmembrane, inner side

79 Model of α-helix membrane proteins (HMM2)
Outer side, transmembrane, inner side

80 Neural network for the prediction of TMS in membrane proteins
Output: TM, non-TM (for example nnttttttttttttttttnnnnnnnttttttttnnnnttt)
Input columns: SeqNo, No, V L I M F W Y G A P S T C H R K Q E N D

81 Sequence -> sequence profiles -> NN, HMM1, HMM2 -> Σ (jury) -> MaxSubSeq -> topography -> HMM decoding -> topology prediction

82 ENSEMBLE corrects the errors of the single methods
Cytochrome bc1, Bos taurus, PDB: 1BGY:C
Figure, from top to bottom: ENSEMBLE prediction, HMM2 prediction, HMM1 prediction, NN prediction, observed TMHs, sequence (1bgyC).

83 Performance on the 71 high-resolved proteins
Method      Q topography   Q topology
NN          58/71 (82%)    44/71 (62%)
HMM1        58/71 (82%)    47/71 (66%)
HMM2        47/71 (80%)    50/71 (70%)
ENSEMBLE    63/71 (89%)    54/71 (76%)

84 Performances

85 SecY Translocon Methanococcus jannaschii PDB: 1RHZ:A TMHMM KD PSI KD PRODIV HMM PHDhtm MEMSAT ENSEMBLE Observed

86 Calcium ATPase Rabbit PDB: 1EUL TMHMM KD PSI KD PRODIV HMM PHDhtm MEMSAT ENSEMBLE Observed

87 Clc Chloride Channel Salmonella typhimurium PDB: 1KPL TMHMM KD PSI KD PRODIV HMM PHDhtm MEMSAT ENSEMBLE Observed

88 Performance on the 71 high-resolved proteins
Method       Q topography   Q topology
NN           58/71 (82%)    44/71 (62%)
HMM1         58/71 (82%)    47/71 (66%)
HMM2         47/71 (80%)    50/71 (70%)
ENSEMBLE     63/71 (89%)    54/71 (76%)
PHD          50/71 (70%)    35/71 (49%)
HMMTOP       53/71 (75%)    46/71 (65%)
TMHMM        49/71 (69%)    38/71 (54%)
MEMSAT       44/71 (73%)    44/60 (73%)
PRODIV-HMM   60/71 (85%)    56/71 (78%)
KD           51/71 (72%)    --
PSI-KD       59/71 (83%)    --

89 Discriminating all-α membrane proteins
Plot: rate of classification errors (%) for globular and membrane proteins versus the height of the maximum TM peak (axis from 0.70 to 0.98).

90 Prediction of the cysteine bonding state
Tryparedoxin-I from Crithidia fasciculata (1QK8)
MSGLDKYLPGIEKLRRGDGEVEVKSLAGKLVFFYFSASWCPPCRGFTPQLIEFYDKFHES
KNFEVVFCTWDEEEDGFAGYFAKMPWLAVPFAQSEAVQKLSKHFNVESIPTLIGVDADSG
DVVTTRARATLVKDPEGEQFPWKDAP
Free cysteines: Cys68
Disulphide-bonded cysteines: Cys40, Cys43

91 Perceptron (input: sequence profile)
Output: bonded, non-bonded
Input: NGDQLGIKSKQEALCIAARRNLDLVLVAP

92 Plotting the trained weights: Hinton's plot
Figure: two panels (bonding state, non-bonding state), each with window position on one axis and residue (V L I M F W Y G A P S T C H R K Q E N D & #) on the other.

93 Is it possible to model a syntax (bonded cysteines must be in even number)?
Model: Begin, states 1 to 4 (free states and bonded states), End

94 A path
Model: Begin, states 1 to 4, End
Columns: residue, state, bonding state
C40
C43
C68

95 A path
C40   1   F
C43
C68
P(seq) = P(1 | Begin) P(C40 | 1) ...

96 A path
C40   1   F
C43   2   B
C68
P(seq) = P(1 | Begin) P(C40 | 1) ... P(2 | 1) P(C43 | 2) ...

97 A path
C40   1   F
C43   2   B
C68   4   B
P(seq) = P(1 | Begin) P(C40 | 1) ... P(2 | 1) P(C43 | 2) ... P(4 | 2) P(C68 | 4) ...

98 A path
C40   1   F
C43   2   B
C68   4   B
P(seq) = P(1 | Begin) P(C40 | 1) ... P(2 | 1) P(C43 | 2) ... P(4 | 2) P(C68 | 4) ... P(End | 4)

99 4 possible paths
Each path runs from Begin to End; entries give the state and, in parentheses, the bonding state.
Path   C40     C43     C68
1      1 (F)   2 (B)   4 (B)
2      2 (B)   3 (F)   4 (B)
3      1 (F)   1 (F)   1 (F)
4      2 (B)   4 (B)   1 (F)
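The four paths can be reproduced by enumerating the model. The transition table below is inferred from the paths listed on the slide and from the even-number constraint; it is an assumption, since the slide figure itself is not available in the transcript.

```python
# Allowed transitions, inferred from the listed paths (an assumption).
# States 1 and 3 emit a free cysteine (F), states 2 and 4 a bonded one (B).
# End is reachable only when the number of bonded cysteines is even.
ALLOWED = {
    "Begin": ["1", "2"],
    "1": ["1", "2", "End"],
    "2": ["3", "4"],
    "3": ["3", "4"],
    "4": ["1", "2", "End"],
}
LABEL = {"1": "F", "2": "B", "3": "F", "4": "B"}


def valid_paths(n_cysteines):
    """List every state path from Begin to End that visits n_cysteines states."""
    paths = []

    def extend(state, path):
        if len(path) == n_cysteines:
            if "End" in ALLOWED[state]:
                paths.append(path)
            return
        for nxt in ALLOWED[state]:
            if nxt != "End":
                extend(nxt, path + [nxt])

    extend("Begin", [])
    return paths


for path in valid_paths(3):
    print(path, [LABEL[s] for s in path])
```

Every path returned contains an even number of bonded states, which is how the model encodes the syntax.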

100 Hybrid system
Windows W1, W2, W3 are taken around the cysteines of the sequence:
MYSFPNSFRFGWSQAGFQCEMSTPGSEDPNTDWYKWVHDPENMAAGLCSGDLPENGPGYWGNYKTFHDNAQKMCLKIARLNVEWSRIFPNP...
Neural network outputs: P(B | W1), P(F | W1); P(B | W2), P(F | W2); P(B | W3), P(F | W3)
Hidden Markov model: Begin, free Cys, bonded Cys, End
The Viterbi path gives the prediction of the bonding state of the cysteines.
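A minimal sketch of the hybrid decoding, under stated assumptions: the four-state transition table is inferred from the paths shown earlier, all allowed transitions are given equal weight, and the neural network outputs in the example are invented. It is not the published hidden neural network, only an illustration of how the syntax can overrule the per-cysteine predictions.

```python
import math

# Allowed transitions (an assumption inferred from the example paths).
TRANSITIONS = {
    "Begin": ["1", "2"],
    "1": ["1", "2", "End"],
    "2": ["3", "4"],
    "3": ["3", "4"],
    "4": ["1", "2", "End"],
}
LABEL = {"1": "F", "2": "B", "3": "F", "4": "B"}


def decode_bonding_states(nn_outputs):
    """Best valid labelling given one (P(B | W_i), P(F | W_i)) pair per cysteine.

    Transition probabilities are treated as uniform, so only the syntax
    constrains the path. Probabilities must be strictly positive.
    """
    best = {"Begin": (0.0, [])}  # state -> (log score, labels so far)
    for p_bonded, p_free in nn_outputs:
        emit = {"B": p_bonded, "F": p_free}
        new_best = {}
        for state, (score, labels) in best.items():
            for nxt in TRANSITIONS[state]:
                if nxt == "End":
                    continue
                candidate = score + math.log(emit[LABEL[nxt]])
                if nxt not in new_best or candidate > new_best[nxt][0]:
                    new_best[nxt] = (candidate, labels + [LABEL[nxt]])
        best = new_best
    finals = [value for state, value in best.items() if "End" in TRANSITIONS[state]]
    return max(finals, key=lambda value: value[0])[1] if finals else []


# Invented outputs: taken one by one, all three cysteines look bonded.
print(decode_bonding_states([(0.9, 0.1), (0.8, 0.2), (0.6, 0.4)]))
```

Because three bonded cysteines would violate the syntax, the least confident one is switched to the free state.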

101 Prediction for Tryparedoxin
Residues: C40, C43, C68

102 Prediction for Tryparedoxin
Columns: residue, NN output (B, F), NN prediction
C40: B
C43: B
C68: B

103 Prediction for Tryparedoxin
Residue   NN pred   HMM state (Viterbi path)   HMM pred
C40       B         2                          B
C43       B         4                          B
C68       B         1                          F

104 Performance
Table I. Performance of the NN predictor (20-fold cross validation). Columns: Set, Q2, C, Q(B), Q(F), P(B), P(F), Q2prot. Rows: WD, RD.
Table II. Performance of the hidden NN predictor, the hybrid system (20-fold cross validation). Columns: Set, Q2, C, Q(B), Q(F), P(B), P(F), Q2prot. Rows: WD, RD.
B = cysteine bonding state, F = cysteine free state. WD = whole database (969 proteins, 4136 cysteines). RD = reduced database, in which the chains containing only one cysteine are removed (782 proteins, 3949 cysteines).
Martelli PL, Fariselli P, Malaguti L, Casadio R. Prediction of the disulfide bonding state of cysteines in proteins with hidden neural networks. Protein Eng. 15: (2002)


More information

BIOINFORMATICS. Enhanced Recognition of Protein Transmembrane Domains with Prediction-based Structural Profiles

BIOINFORMATICS. Enhanced Recognition of Protein Transmembrane Domains with Prediction-based Structural Profiles BIOINFORMATICS Vol.? no.? 200? Pages 1 1 Enhanced Recognition of Protein Transmembrane Domains with Prediction-based Structural Profiles Baoqiang Cao 2, Aleksey Porollo 1, Rafal Adamczak 1, Mark Jarrell

More information

Central Dogma. modifications genome transcriptome proteome

Central Dogma. modifications genome transcriptome proteome entral Dogma DA ma protein post-translational modifications genome transcriptome proteome 83 ierarchy of Protein Structure 20 Amino Acids There are 20 n possible sequences for a protein of n residues!

More information

Ramachandran Plot. 4ysz Phi (degrees) Plot statistics

Ramachandran Plot. 4ysz Phi (degrees) Plot statistics B Ramachandran Plot ~b b 135 b ~b ~l l Psi (degrees) 5-5 a A ~a L - -135 SER HIS (F) 59 (G) SER (B) ~b b LYS ASP ASP 315 13 13 (A) (F) (B) LYS ALA ALA 315 173 (E) 173 (E)(A) ~p p ~b - -135 - -5 5 135 (degrees)

More information

Improved Protein Secondary Structure Prediction

Improved Protein Secondary Structure Prediction Improved Protein Secondary Structure Prediction Secondary Structure Prediction! Given a protein sequence a 1 a 2 a N, secondary structure prediction aims at defining the state of each amino acid ai as

More information

Structure and evolution of the spliceosomal peptidyl-prolyl cistrans isomerase Cwc27

Structure and evolution of the spliceosomal peptidyl-prolyl cistrans isomerase Cwc27 Acta Cryst. (2014). D70, doi:10.1107/s1399004714021695 Supporting information Volume 70 (2014) Supporting information for article: Structure and evolution of the spliceosomal peptidyl-prolyl cistrans isomerase

More information

Topology Prediction of Helical Transmembrane Proteins: How Far Have We Reached?

Topology Prediction of Helical Transmembrane Proteins: How Far Have We Reached? 550 Current Protein and Peptide Science, 2010, 11, 550-561 Topology Prediction of Helical Transmembrane Proteins: How Far Have We Reached? Gábor E. Tusnády and István Simon* Institute of Enzymology, BRC,

More information
