Protein Secondary Structure Prediction

Protein Secondary Structure Prediction. Doug Brutlag & Scott C. Schmidler

Overview: goals and problem definition; existing approaches (classic methods, recent successful approaches); evaluating prediction algorithms; shortcomings of existing approaches; current research.

Protein structure prediction Sequence of 984 amino acids: PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIG PENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKK KKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGW KGSPAIFQSSMTKILEPFKKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEE LRQHLLRWGLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVN DIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAEN REILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARM RGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQA TWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVT NKGRQKVVPLTNTTNQKTELQAIYLALQDSGLEVNIVTDSQYALGIIQAQP DKSESELVNQIIEQLIKKEKVYLAWVPAHKGIGGNEQVDKLVSAGI PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIG PENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKK KKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGW KGSPAIFQSSMTKILEPFKKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEE LRQHLLRWGLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVN DIQKLVGKLNWASQIYPGIKVKQLCKLLRGTKALTEVIPLTEEAELELAEN REILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARM RGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQA TWIPEWEFVNTPPLVKLWYQ HIV reverse transcriptase 3D coordinates of 7404 atoms:

Abstracting the problem. Successive levels of representation (figures): 3D coords of all atoms; 3D coords of the C-α backbone (C-α groups); 3D coords of secondary structure elements.

Secondary structure prediction for protein folding. Sequence of amino acids (the HIV reverse transcriptase sequence shown in full above): PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEK... Predict structural segments. Goal: recover 3D coords. The secondary structure prediction problem: given a protein sequence NWVLSTAADMQGVVTDGMASGLDKD..., predict a secondary structure sequence LLEEEELLLLHHHHHHHHHHLHHHL..., where H = α-helix, E = extended β-strand, L = loop/coil.

Defining the secondary structure of a protein sequence (figure: α-helix and anti-parallel β-sheet). Residue sequence: NWVLSTAADMQGVVTDGMASFLDKD... Secondary structure: LLEEEELLLLHHHHHHHHHHLHHHL... Fig. 1: Syntactic formulation of secondary structure problem.

Abstracted version of protein structure prediction Sequence of 984 amino acids: 3D coords of 179 structural elements: PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIG PENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKK KKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGW KGSPAIFQSSMTKILEPFKKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEE LRQHLLRWGLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVN DIQKLVGKLNWASQIYPGIKVRQLCKLLRGTKALTEVIPLTEEAELELAEN REILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARM RGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQA TWIPEWEFVNTPPLVKLWYQLEKEPIVGAETFYVDGAANRETKLGKAGYVT NKGRQKVVPLTNTTNQKTELQAIYLALQDSGLEVNIVTDSQYALGIIQAQP DKSESELVNQIIEQLIKKEKVYLAWVPAHKGIGGNEQVDKLVSAGI PISPIETVPVKLKPGMDGPKVKQWPLTEEKIKALVEICTEMEKEGKISKIG PENPYNTPVFAIKKKDSTKWRKLVDFRELNKRTQDFWEVQLGIPHPAGLKK KKSVTVLDVGDAYFSVPLDEDFRKYTAFTIPSINNETPGIRYQYNVLPQGW KGSPAIFQSSMTKILEPFKKQNPDIVIYQYMDDLYVGSDLEIGQHRTKIEE LRQHLLRWGLTTPDKKHQKEPPFLWMGYELHPDKWTVQPIVLPEKDSWTVN DIQKLVGKLNWASQIYPGIKVKQLCKLLRGTKALTEVIPLTEEAELELAEN REILKEPVHGVYYDPSKDLIAEIQKQGQGQWTYQIYQEPFKNLKTGKYARM RGAHTNDVKQLTEAVQKITTESIVIWGKTPKFKLPIQKETWETWWTEYWQA TWIPEWEFVNTPPLVKLWYQ

The secondary structure prediction problem. Given a protein sequence NWVLSTAADMQGVVTDGMASGLDKD..., predict a secondary structure sequence LLEEEELLLLHHHHHHHHHHLHHHL... 3-state problem: $\{A,R,N,D,C,Q,E,G,H,I,L,K,M,F,P,S,T,W,Y,V\}^n \to \{L,H,E\}^n$.

Chou-Fasman method (Chou & Fasman, 1974). Calculate the propensity for each amino acid $A$ to be in helix, strand, or coil (state $S$): $\dfrac{P(A \mid S)}{P(A)} = \dfrac{n_{A,S}/n_S}{n_A/n}$. Classify by propensity.
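The propensity ratio can be estimated directly from counts. The following is a minimal sketch, assuming aligned sequence and structure strings; the function name `propensities` is hypothetical, and the single fragment used as input is only the example from these slides, far too small for real estimates.

```python
from collections import Counter

def propensities(sequences, structures, states="HEL"):
    """Return {(aa, state): P(aa | state) / P(aa)} from aligned strings."""
    n_as = Counter()  # n_{A,S}: residue A observed in state S
    n_a = Counter()   # n_A: residue A overall
    n_s = Counter()   # n_S: state S overall
    for seq, ss in zip(sequences, structures):
        if len(seq) != len(ss):
            raise ValueError("sequence and structure must have equal length")
        for aa, s in zip(seq, ss):
            n_as[aa, s] += 1
            n_a[aa] += 1
            n_s[s] += 1
    n = sum(n_a.values())
    return {
        (aa, s): (n_as[aa, s] / n_s[s]) / (n_a[aa] / n)
        for aa in n_a
        for s in states
        if n_s[s] > 0
    }

# Toy input: the example fragment used on the slides.
seq = "NWVLSTAADMQGVVTDGMASFLDKD"
ss = "LLEEEELLLLHHHHHHHHHHLHHHL"
props = propensities([seq], [ss])
print(round(props["V", "H"], 3))
```

A value above 1 means the residue is over-represented in that state relative to its overall frequency.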

Chou-Fasman prediction. Search for nucleation sites (helix nucleation: score > 4 in a window of 6). Propagate until termination criteria are met (helix termination: tetrapeptide with mean propensity < 1). Rules for conflict resolution, exceptions. Accuracy: 70-80% reported (< 55% according to Nishikawa, 1983).

GOR (Garnier et al., 1978). Calculate information content for each amino acid: $I(S;R) = \log\left[\dfrac{P(S \mid R)}{P(S)}\right]$ (the ratio inside the log is the same as the CF propensity). Information difference: $I(\Delta S;R) = I(S;R) - I(\bar{S};R) = \log\left[\dfrac{P(S,R)}{P(\bar{S},R)}\right] + \log\left[\dfrac{P(\bar{S})}{P(S)}\right]$ (likelihood ratio). Predict the state with maximum information using a window of size 17. Accuracy: 64% (< 55% according to Nishikawa, 1983).
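A window-summed information difference can be sketched as follows. This is an illustration under stated assumptions, not the published GOR implementation: each window offset contributes an independent single-residue term, pseudocounts (`pseudo`) are added to avoid taking the log of zero, and the function names are hypothetical.

```python
import math
from collections import Counter

HALF = 8  # window of size 17: 8 residues on each side of the central position

def train(seqs, structs):
    """Count residue aa seen at offset m from a position in state s."""
    counts = Counter()   # (state, offset, residue) -> count
    n_state = Counter()  # state -> count
    for seq, ss in zip(seqs, structs):
        for j, s in enumerate(ss):
            n_state[s] += 1
            for m in range(-HALF, HALF + 1):
                if 0 <= j + m < len(seq):
                    counts[s, m, seq[j + m]] += 1
    return counts, n_state

def info_difference(counts, n_state, s, m, aa, pseudo=1.0):
    """log[P(S,R)/P(not S,R)] + log[P(not S)/P(S)], estimated from counts."""
    in_s = counts[s, m, aa] + pseudo
    not_s = sum(counts[t, m, aa] for t in n_state if t != s) + pseudo
    n_s = n_state[s]
    n_not = sum(n_state.values()) - n_s
    return math.log(in_s / not_s) + math.log((n_not + pseudo) / (n_s + pseudo))

def predict(seq, counts, n_state):
    """For each position, pick the state with the largest summed information."""
    out = []
    for j in range(len(seq)):
        def score(s):
            return sum(
                info_difference(counts, n_state, s, m, seq[j + m])
                for m in range(-HALF, HALF + 1)
                if 0 <= j + m < len(seq)
            )
        out.append(max(n_state, key=score))
    return "".join(out)

# Toy training data, for illustration only.
counts, n_state = train(["AAAA", "GGGG"], ["HHHH", "LLLL"])
print(predict("AAAA", counts, n_state))
```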

Window-based prediction. For each position in a protein sequence (...LSTAADMQGVVTDGMASGLDKD...), predict its secondary structure based on a local window, then slide the window along the sequence one residue at a time.
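Window extraction can be sketched as below. The centered window, the `half_width` parameter, and the `-` padding symbol at the sequence ends are assumptions made for the example; the slides do not specify how the ends are handled.

```python
def windows(seq, half_width=8, pad="-"):
    """Return one window of 2*half_width+1 residues per position, padded at the ends."""
    padded = pad * half_width + seq + pad * half_width
    size = 2 * half_width + 1
    return [padded[i:i + size] for i in range(len(seq))]

fragment = "LSTAADMQGVVTDGMASGLDKD"
for w in windows(fragment, half_width=3)[:3]:
    print(w)
```

Each window is then passed to a classifier that predicts the state of its central residue.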

Modeling structural correlations. Conditional independence models (naive Bayes classifier): a class in {L, H, E} (helix, strand, loop) with a residue distribution P(A | H), P(A | E), P(A | L) over the 20 amino acids (A R N D C Q E G H I L K M F P S T W Y V) at each window position i-4 ... i+4. Pair-wise dependence: residue-by-residue tables over the 20 amino acids between position i and positions i+1 ... i+4 (Klinger: structural correlations).
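The conditional independence model can be sketched as a naive Bayes classifier over a window of positions i-4 ... i+4. This is a minimal sketch: the add-one smoothing, the padding symbol, and the toy training data are assumptions for illustration, not details given on the slides.

```python
import math
from collections import Counter

ALPHABET = "ARNDCQEGHILKMFPSTWYV-"  # 20 amino acids plus a padding symbol

def windows(seq, half):
    padded = "-" * half + seq + "-" * half
    return [padded[i:i + 2 * half + 1] for i in range(len(seq))]

def train(seqs, structs, half=4):
    """Count class frequencies and per-position residue frequencies."""
    prior = Counter()  # state -> count
    emit = Counter()   # (state, window index, residue) -> count
    for seq, ss in zip(seqs, structs):
        for win, s in zip(windows(seq, half), ss):
            prior[s] += 1
            for k, aa in enumerate(win):
                emit[s, k, aa] += 1
    return prior, emit

def predict(seq, prior, emit, half=4):
    """Pick argmax over states of log P(state) + sum_k log P(residue_k | state, k)."""
    total = sum(prior.values())
    out = []
    for win in windows(seq, half):
        def score(s):
            lp = math.log(prior[s] / total)
            for k, aa in enumerate(win):
                # add-one smoothing over the alphabet (an assumption)
                lp += math.log((emit[s, k, aa] + 1) / (prior[s] + len(ALPHABET)))
            return lp
        out.append(max(prior, key=score))
    return "".join(out)

# Toy training data, for illustration only.
prior, emit = train(["AAAAA", "GGGGG"], ["HHHHH", "LLLLL"])
print(predict("AAAAA", prior, emit))
```

Pair-wise dependence would replace the independent per-position terms with terms conditioned on a neighbouring residue, at the cost of many more parameters.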

Hydrolase (β-lactamase) with amphipathic α-helix

Amphipathic α-helix: hydrophobic side chains

Amphipathic α-helix: side chain periodicity Sequence: NLAKMVVKTAEAILKD

Structural correlations in β-strands. [Figure: axis running from 5 to -4; strand (E) labels.]

PHD (Rost & Sander, 1993). Neural network based (multi-layer perceptron, fully connected). 2 levels: Sequence -> Structure, then Structure -> Structure. Uses multiple sequence alignment: amino acid frequencies, conservation weight. Post-processing by dynamic programming. [Figure: window of positions i-4 ... i+4 over the amino acid sequence, feeding structure predictions H, L, E.]

Special case: helical transmembrane proteins. Membrane proteins are biologically important and their structures are difficult to determine experimentally, but easier to predict: constraints imposed by lipid bilayer; strong hydrophobicity signal; cytoplasmic residues positively charged. 2-state; bacteriorhodopsin. Accuracy: 95% (multiple alignment).

Predator (Frishman & Argos, 1996). Nearest-neighbor classifier: represent subsequence as a vector; define a distance metric; find closest vectors in training set, vote. Adds non-local terms for hydrogen bonding propensity (helices, sheets). Accuracy: 68% single sequence; 75% multiple alignment.
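The nearest-neighbor steps (vector, distance, vote) can be sketched as below. This illustrates the generic idea only and is not Predator itself: the Hamming distance on raw windows is an assumed metric, the non-local hydrogen bonding terms are omitted, and the function names are hypothetical.

```python
from collections import Counter

def windows(seq, half):
    padded = "-" * half + seq + "-" * half
    return [padded[i:i + 2 * half + 1] for i in range(len(seq))]

def hamming(a, b):
    """Number of mismatching positions between two equal-length windows."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(seq, train_seqs, train_structs, half=3, k=3):
    """Label each position by majority vote among its k closest training windows."""
    examples = [
        (w, s)
        for t_seq, t_ss in zip(train_seqs, train_structs)
        for w, s in zip(windows(t_seq, half), t_ss)
    ]
    out = []
    for w in windows(seq, half):
        nearest = sorted(examples, key=lambda e: hamming(w, e[0]))[:k]
        votes = Counter(s for _, s in nearest)
        out.append(votes.most_common(1)[0][0])
    return "".join(out)

# Toy input: the example fragment used on the slides.
seq = "NWVLSTAADMQGVVTDGMASFLDKD"
ss = "LLEEEELLLLHHHHHHHHHHLHHHL"
print(knn_predict(seq, [seq], [ss], half=3, k=1))
```

With `k=1` and the query included in the training set, every window matches itself exactly, so the output reproduces the known labels; a real evaluation must keep test proteins out of the training set.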

SSPAL (Salamov & Solovyev, 1997). Nearest-neighbor can be viewed as fixed-length, non-gapped local alignment. Find K (= 50) best non-overlapping local alignments with known structures. Predict each position by consensus of alignment structures, weighted by score. Accuracy: 71.2% single sequence; 73.5% multiple sequence alignment.

Evaluation of secondary structure prediction. Large database of protein sequences with known structures (X-ray crystallography, NMR) and gold-standard assignment; non-homologous (< 25-30% identity); cross-validation.

What to measure? Q3 (3-state accuracy): percent of residues correct. Matthews correlation coefficient: adjusts for prevalence. Segment-based measures (Rost et al., 1994; Taylor, 1984; Presnell et al., 1992).
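The two per-residue measures can be computed as in the sketch below. It uses the standard definitions of Q3 and of the Matthews correlation coefficient for one state against the rest; the function names and the toy strings are illustrative.

```python
import math

def q3(true, pred):
    """Percent of residues whose predicted state equals the true state."""
    if len(true) != len(pred):
        raise ValueError("strings must have equal length")
    return 100.0 * sum(t == p for t, p in zip(true, pred)) / len(true)

def matthews(true, pred, state):
    """Matthews correlation coefficient for one state versus all others."""
    tp = fp = tn = fn = 0
    for t, p in zip(true, pred):
        if p == state and t == state:
            tp += 1
        elif p == state:
            fp += 1
        elif t == state:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

true = "HHHHLLLL"
pred = "HHLLLLLL"
print(q3(true, pred), round(matthews(true, pred, "H"), 3))
```

Q3 rewards over-predicting the most common state, which is why the correlation coefficient is reported per state alongside it.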

Bayesian Segmentation of Protein Secondary Structure. Scott C. Schmidler (1,3), Jun S. Liu (2), Douglas L. Brutlag (3). (1) Section on Medical Informatics, (2) Department of Statistics, (3) Department of Biochemistry, Stanford University.

Goals. Improved secondary structure prediction: > 70% accuracy (75% MSA); accurate estimates of prediction variability. Combining structural data with scientific knowledge.

Bayesian structure prediction. Model-based structure prediction. Probabilistic modeling of segments: hydrophobicity patterns; side chain interactions; helical capping. Predict structure to maximize probability: optimal segmentation of protein sequence, e.g. ...NW | VLST | AADM | QGVVTDGMAS | F | LDK | D... with segment types L, E, L, H, L, H, L. (Doug Brutlag, 2000)

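Model. Joint distribution: $P(R, S, T) = P(S, T) \prod_{j=1}^{m} P\left(R_{[S_{j-1}+1:S_j]} \mid S, T\right)$ (conditional independence model for inter-segment residues). With Markovian dependence in $S, T$: $P(R, S, T) = \prod_{j=1}^{m} P(T_j \mid T_{j-1})\, P(S_j \mid S_{j-1}, T_j)\, P\left(R_{[S_{j-1}+1:S_j]} \mid S_{j-1}, S_j, T_j\right)$. Here $R$ is the residue sequence, $S_j$ the end position of segment $j$, $T_j$ its type, and $m$ the number of segments. Example: segment types L E L H L H L; R = NWVLSTAADMQGVVTDGMASFLDKD; SS = LLEEEELLLLHHHHHHHHHHLHHHL.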
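The factorized joint distribution lends itself to dynamic programming over segment end points. The sketch below is an illustration under strong assumptions, not the authors' implementation: residues within a segment are treated as independent given the segment type, adjacent segments must differ in type, and every parameter (`log_trans`, `log_len`, `log_emit`, the favoured-residue sets) is a hypothetical toy value.

```python
import math

TYPES = "HEL"
FAVOURED = {"H": "AELMQK", "E": "VIYFWT", "L": "GNPSD"}  # toy values

def log_trans(prev, t):   # log P(T_j | T_{j-1}), toy values
    return math.log(1 / 3) if prev == "start" else math.log(0.5)

def log_len(length, t):   # log P(S_j | S_{j-1}, T_j) as a geometric length prior
    return math.log(0.2) + (length - 1) * math.log(0.8)

def log_emit(aa, t):      # per-residue term; independence is an assumption
    fav = FAVOURED[t]
    return math.log(0.6 / len(fav) if aa in fav else 0.4 / (20 - len(fav)))

def log_joint(seq, ends, types):
    """log P(R, S, T) for a given segmentation (segment ends and types)."""
    total, start, prev = 0.0, 0, "start"
    for end, t in zip(ends, types):
        total += log_trans(prev, t) + log_len(end - start, t)
        total += sum(log_emit(aa, t) for aa in seq[start:end])
        start, prev = end, t
    return total

def map_segmentation(seq, max_len=30):
    """Return (ends, types, score) maximizing log P(R, S, T)."""
    n = len(seq)
    # best[e][t]: best score of seq[:e] whose last segment has type t and ends at e
    best = [{t: -math.inf for t in TYPES} for _ in range(n + 1)]
    back = [{t: None for t in TYPES} for _ in range(n + 1)]
    for e in range(1, n + 1):
        for t in TYPES:
            emit = 0.0
            for s in range(e - 1, max(-1, e - max_len - 1), -1):
                emit += log_emit(seq[s], t)  # extend segment seq[s:e] leftwards
                seg = log_len(e - s, t) + emit
                if s == 0:
                    cands = [("start", log_trans("start", t))]
                else:
                    cands = [(p, best[s][p] + log_trans(p, t))
                             for p in TYPES if p != t]
                for p, score in cands:
                    if score + seg > best[e][t]:
                        best[e][t] = score + seg
                        back[e][t] = (s, p)
    t = max(TYPES, key=lambda x: best[n][x])
    score, e, ends, types = best[n][t], n, [], []
    while e > 0:  # trace the best path back to the start
        ends.append(e)
        types.append(t)
        e, t = back[e][t]
    return ends[::-1], types[::-1], score

seq = "AAAAAVVVVVGGGGG"  # toy sequence, not from the slides
ends, types, score = map_segmentation(seq)
print(ends, types)
```

Because the score of a segment depends only on its own end points and type, the search over all segmentations costs on the order of n × max_len steps per type rather than being exponential.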

Position-specific preferences. [Figure, four panels: helix N-cap model, positions 1 & 2; helix internal position model; strand internal position model; loop/coil N-cap model, positions 1 & 2. Horizontal axis: amino acid (A R N D C Q E G H I L K M F P S T W Y V); vertical axis: 0 to 0.18.]

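Segment likelihood: modeling correlations among properties. Helix: $P\left(R_{[k:j]} \mid S_{q-1}, S_q, \text{Helix}\right) = \prod_{i=k}^{j} P_{\text{Helix}}\left(R_{[i]} \mid H_i\right) P_{\text{Helix}}\left(H_i \mid H_{i-2}, H_{i-3}, H_{i-4}, H_{i-7}\right)$. Strand: $P\left(R_{[k:j]} \mid S_{q-1}, S_q, \text{Strand}\right) = \prod_{i=k}^{j} P_{\text{Strand}}\left(R_{[i]} \mid H_i\right) P_{\text{Strand}}\left(H_i \mid H_{i-2}, H_{i-3}\right)$. Loop: $P\left(R_{[k:j]} \mid S_{q-1}, S_q, \text{Loop}\right) = \prod_{i=k}^{j} P_{\text{Loop}}\left(R_{[i]} \mid H_i\right) P_{\text{Loop}}\left(H_i \mid H_{i-2}\right)$. [Figure: helix model with property nodes H1 ... H6 attached to residues R1 ... R6.]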

Segment length distributions. [Figure: helix and strand curves; horizontal axis: segment length; vertical axis: 0 to 0.2.]

Probabilistic model: P(Sequence | Structure) P(Structure). Structure- and position-specific frequencies. Segment length priors [figure: helix and strand; horizontal axis: segment length; vertical axis: 0 to 0.2]. Conditional independence of inter-segment residues. Markovian dependence in segment types (L E L H L H L).

Benefits. Explicit probabilistic model: sound semantics; coherent treatment of uncertainty/noisy data; explicit, testable assumptions. Fully Bayesian prediction: averaged over all possible models; accurate estimates of uncertainty.

Example prediction: Cytochrome C5 (1cc5). [Figure: true vs. predicted secondary structure.]

Prediction confidence. [Figure: three panels plotted against prediction threshold (0.4 to 0.9); series labelled %H, %E, %L.]

Evaluation. 453 structures selected from the Brookhaven Protein Data Bank: < 2.5 Å resolution; < 25% sequence similarity. Cross-validation results (single sequence), Bayesian segmentation algorithm: marginal mode prediction 68.8%; MAP segmentation prediction 64.2%. Previous best published: 68% (Frishman & Argos, 1996); 71% (Salamov & Solovyev, 1997).

Beyond secondary structure: predicting 3D contacts. Model captures local dependencies: structure-specific residue propensities; intra-segment side chain correlations. Tertiary interactions: β-sheets; coiled-coils; disulfide bonds.

β-sheet side chain correlations. Odds ratio: $\dfrac{P(A_i, A_j \mid \text{Struct})}{P(A_i \mid \text{Struct})\, P(A_j \mid \text{Struct})}$. [Figure panels: charged-pair interaction in Glutaredoxin; disulfide bonds; hydrophobic side chains; stabilizing pairs from (Smith & Regan, Science 1995).]
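The odds ratio compares how often a residue pair co-occurs with what independence would predict. A minimal sketch, assuming the input is a list of residue pairs already extracted from one structural context (for example, cross-strand neighbours); the function name and the toy pairs are hypothetical.

```python
from collections import Counter

def pair_odds_ratios(pairs):
    """Return {(a, b): P(a, b) / (P(a) * P(b))} for the observed ordered pairs."""
    joint = Counter(pairs)
    first = Counter(a for a, _ in pairs)
    second = Counter(b for _, b in pairs)
    n = len(pairs)
    return {
        (a, b): (c / n) / ((first[a] / n) * (second[b] / n))
        for (a, b), c in joint.items()
    }

# Toy pairs, for illustration only.
pairs = [("K", "E"), ("K", "E"), ("V", "I"), ("V", "V")]
ratios = pair_odds_ratios(pairs)
print(ratios["K", "E"])
```

A ratio above 1 indicates a pair seen more often than expected from the single-residue frequencies, which is the signal used to score candidate strand pairings.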

Incorporating non-local information. Segment interaction models: replace terms $P(\text{Segment}_j)$ and $P(\text{Segment}_k)$ with $P(\text{Segment}_j, \text{Segment}_k)$, but computation... [Figure: parallel β-sheet in 1nzy, with segment type labels.]

Prediction of β-strand contact map for 5pti. [Figure: predicted contacts; true contacts.] Pairing and register of β-hairpin correctly predicted.

Previous approaches: Hubbard (1995). [Figure: predicted contacts (single sequence); predicted contacts (multiple sequence alignment).]

Flavodoxin (5nul). [Figure: predicted contacts; true contacts.] β-strands well predicted, but poor specificity in strand pairing.

Future work. Models for subclasses of segments: environment (amphipathic/buried/exposed); structure (3-10 helices, β- and γ-turns). Model selection. Multiple sequence alignment information.

Conclusions. Probabilistic modeling of protein structure. Prediction by segmentation of sequence. Independent segment models perform comparably to existing approaches. General framework for modeling non-local interactions to predict 3D contacts.