Human and Server CAPRI Protein Docking Prediction Using LZerD with Combined Scoring Functions. Daisuke Kihara


Human and Server CAPRI Protein Docking Prediction Using LZerD with Combined Scoring Functions
Daisuke Kihara
Department of Biological Sciences, Department of Computer Science
Purdue University, Indiana, USA
http://kiharalab.org
1

CAPRI Round 30 Results (Lensink et al., CAPRI30 group paper, 2016) 2

Overview of Protein Docking Prediction Using LZerD in CAPRI
Single-chain modeling: HHPred, MUFold, TASSER, SparksX, TASSERlite, Phyre2, MultiCom
-> PRESCO -> sub-unit models
-> LZerD -> ~50,000 docking models
-> Clustering, RMSD < 5 Å
-> Re-ranking with scoring functions -> 10 models
-> MD relaxation
-> Submit
3
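The clustering step of the pipeline (RMSD < 5 Å) can be sketched as greedy leader clustering over score-sorted docking models. This is only an illustrative sketch: the slides do not say which clustering algorithm or RMSD variant is used, and the function names `rmsd` and `greedy_cluster` are hypothetical. It assumes all decoys share the receptor frame, so no superposition is done.

```python
import numpy as np

def rmsd(a, b):
    """RMSD between two (N, 3) coordinate arrays in the same frame (no superposition)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def greedy_cluster(coords, scores, cutoff=5.0):
    """Return indices of cluster representatives.

    Models are visited from best (lowest) score to worst. A model becomes a new
    representative only if it is at least `cutoff` Å away from every
    representative chosen so far (assumed convention: lower score is better).
    """
    order = np.argsort(scores)
    reps = []
    for i in order:
        if all(rmsd(coords[i], coords[j]) >= cutoff for j in reps):
            reps.append(int(i))
    return reps

# Toy example: three "ligand poses" of four atoms each.
base = np.zeros((4, 3))
models = [base, base + 1.0, base + 10.0]
scores = [-3.0, -5.0, -1.0]
representatives = greedy_cluster(models, scores)
```

Models 0 and 1 are about 1.7 Å apart, so only the better-scoring one is kept as a representative; model 2 is far from both and starts its own cluster.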

LZerD ("Lizard"): Local 3D Zernike descriptor-based Docking program. [Figure labels: normal vector; 6 Å; 3D Zernike descriptor.] (Venkatraman, Yang, Sael, & Kihara, BMC Bioinformatics, 2009) 4

3D Zernike Descriptors (3DZD)
An extension of spherical harmonics-based descriptors.
A 3D object can be represented by a series of orthogonal functions, and thus practically by a series of coefficients used as a feature vector.
Compact. Rotation invariant.
3D Zernike functions:
$$Z_{nl}^{m}(r,\vartheta,\varphi) = R_{nl}(r)\,Y_{l}^{m}(\vartheta,\varphi)$$
where $Y_{l}^{m}(\vartheta,\varphi)$ are spherical harmonics and $R_{nl}(r)$ are radial functions; the $Z_{nl}^{m}$ are polynomials in Cartesian coordinates.
Zernike moments:
$$\Omega_{nl}^{m} = \frac{3}{4\pi}\int_{|\mathbf{x}|\le 1} f(\mathbf{x})\,\overline{Z_{nl}^{m}(\mathbf{x})}\,d\mathbf{x}$$
Zernike descriptor:
$$F_{nl} = \sqrt{\sum_{m=-l}^{m=l}\left(\Omega_{nl}^{m}\right)^{2}}$$
[Figure: A surface representation of 1ew0A (A) is reconstructed from its 3D Zernike invariants of the order 5, 10, 15, 20, and 25 (B-F).] (Sael & Kihara, 2009)
5
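As a rough illustration of the last step (collapsing the moments over m into rotation-invariant values F_nl), the sketch below takes precomputed moments and returns the descriptor vector. It does not compute the moments themselves. The function names and the dictionary layout are hypothetical, the index rule (0 ≤ l ≤ n with n − l even) is the standard one for 3D Zernike functions, and the squared term is taken as the squared modulus so the result is real for complex moments.

```python
import numpy as np

def zernike_index_pairs(max_order):
    """All (n, l) with 0 <= l <= n <= max_order and (n - l) even."""
    return [(n, l)
            for n in range(max_order + 1)
            for l in range(n % 2, n + 1, 2)]

def zernike_descriptor(moments, max_order):
    """Collapse moments over m into rotation-invariant norms F_nl.

    `moments` maps (n, l) to a complex array holding the 2l + 1 values
    Omega_nl^m for m = -l..l (assumed layout for this sketch).
    """
    return np.array([np.linalg.norm(moments[(n, l)])
                     for (n, l) in zernike_index_pairs(max_order)])

# Number of invariants for a given maximum order.
n_invariants_order_20 = len(zernike_index_pairs(20))

# Toy moments up to order 1: (0, 0) has one value, (1, 1) has three.
toy = {(0, 0): np.array([3 + 4j]),
       (1, 1): np.array([1.0, 2.0, 2.0])}
toy_descriptor = zernike_descriptor(toy, 1)
```

Because each F_nl depends only on the norm over m, two objects that differ by a rotation map to the same vector, which is what makes the descriptor usable for comparing surface patches without alignment.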

Protein Residue Environment SCOre (PRESCO). [Figure labels: center along the main chain; within a sphere of 6 or 8 Å.] (Kim & Kihara, Proteins 2014) 6

Finding Similar Side-Chain Depth Environment (SDE) from a database
Query SDE surface
Structure database: 2536 proteins
500 lowest-RMSD fragments of 9 side-chain centroids, superimposed with the query fragment
Select SDEs with the same number of side-chain centroids in the sphere of 8.0 Å
Compute RMSD of residue depth for corresponding side-chain centroids
Sort by depth RMSD to the query
7
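The last three steps of the SDE search (filter by centroid count, compute depth RMSD, sort) can be sketched as follows. This is a minimal sketch, not the PRESCO implementation: it assumes the residue depths are already computed and that corresponding centroids share the same index in the query and candidate arrays. The function name is hypothetical.

```python
import numpy as np

def rank_by_depth_rmsd(query_depths, candidates):
    """Rank candidate environments by RMSD of residue depth to the query.

    `candidates` is a list of (label, depths) pairs. Candidates whose number
    of side-chain centroids differs from the query are skipped.
    Returns a list of (depth_rmsd, label), most similar first.
    """
    q = np.asarray(query_depths, dtype=float)
    ranked = []
    for label, depths in candidates:
        d = np.asarray(depths, dtype=float)
        if d.shape != q.shape:
            continue  # different number of centroids in the sphere
        ranked.append((float(np.sqrt(np.mean((d - q) ** 2))), label))
    ranked.sort()
    return ranked

query = [1.0, 2.0, 3.0]
database = [("b", [2.0, 3.0, 4.0]),
            ("a", [1.0, 2.0, 3.0]),
            ("c", [1.0, 2.0])]      # wrong centroid count, filtered out
ranking = rank_by_depth_rmsd(query, database)
```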

CASP11 Free Modeling Category Ranking (Model 1) (http://www.predictioncenter.org/casp11/zscores_final.cgi?formula=assessors) (Kim & Kihara, Proteins 2015) 8

DFIRE, GOAP, ITScore Scoring Functions
DFIRE (Yaoqi Zhou): statistical distance-dependent atom contact potential using the finite ideal-gas reference state
GOAP (Jeff Skolnick): DFIRE plus an orientation-dependent term
ITScore (Xiaoqin Zou): iteratively refined statistical distance-dependent atom contact potential
9

The BindML Algorithm (La D, & Kihara D, Proteins 2012) 10

Generating Substitution Models. [Figure: substitution models built from iPfam (505 families).] 11

iPfam Dataset Benchmark. [Figure: ROC based on 449 protein complexes.] 12

BindML Webserver http://kiharalab.org/bindml (Wei Q, La D, & Kihara D, Methods in Mol.Biol. In press 2016) 13

T79 (Round 30) (Interface 2)
Kihara: 3 hits; LZerD: 1 hit
Homodimer
LZerD runs:
  No-interface prediction
  With BindML-consPPISP prediction
LZerD selection strategy: consensus of ITScore and GOAP; 5 from no-interface, 5 from BindML-consPPISP
Kihara selection strategy: manual combination of ITScore, GOAP, DFIRE, and PRESCO; 10 from no-interface
14
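The slides do not define how the "consensus of ITScore and GOAP" is computed. One simple possibility, shown here purely as an assumption, is a rank-sum consensus: each scoring function ranks all decoys, the ranks are added, and the decoys with the smallest rank sum are selected. The function name and the convention that lower scores are better are hypothetical choices for this sketch.

```python
def consensus_top(scores_by_function, k=10):
    """Pick the k decoys with the lowest summed rank across scoring functions.

    `scores_by_function` maps a scoring-function name to a list of scores,
    one per decoy, all in the same decoy order (lower score = better).
    Ties in rank sum are broken by decoy index.
    """
    n = len(next(iter(scores_by_function.values())))
    rank_sum = [0] * n
    for scores in scores_by_function.values():
        order = sorted(range(n), key=lambda i: scores[i])
        for rank, i in enumerate(order, start=1):
            rank_sum[i] += rank
    return sorted(range(n), key=lambda i: (rank_sum[i], i))[:k]

# Made-up scores for four decoys.
toy_scores = {
    "ITScore": [-10.0, -30.0, -20.0, -5.0],
    "GOAP": [-100.0, -300.0, -50.0, -200.0],
}
selected = consensus_top(toy_scores, k=2)
```

Decoy 1 is ranked first by both functions and is selected first; the remaining decoys tie on rank sum, so the lowest index fills the second slot.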

T79 Subunit Model Quality
Chain A RMSD: 4.0 Å
Chain B RMSD: 4.0 Å
[Figure legend: native, model.]
15

T79 Human Selected Model
fnat 0.16, L-RMSD 14.1 Å, I-RMSD 3.8 Å
[Figure legend: native, model.]
16
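For readers unfamiliar with how fnat, L-RMSD, and I-RMSD translate into a CAPRI quality class, a small sketch follows. The thresholds are the commonly cited CAPRI criteria, applied as a simple cascade. They are an assumption here and are not stated in the slides, and the function name is hypothetical.

```python
def capri_class(fnat, l_rmsd, i_rmsd):
    """Classify a docking model with commonly cited CAPRI thresholds (assumed).

    fnat: fraction of native contacts; l_rmsd, i_rmsd: ligand and interface
    RMSD in Å. Classes are tested from best to worst.
    """
    if fnat >= 0.5 and (l_rmsd <= 1.0 or i_rmsd <= 1.0):
        return "high"
    if fnat >= 0.3 and (l_rmsd <= 5.0 or i_rmsd <= 2.0):
        return "medium"
    if fnat >= 0.1 and (l_rmsd <= 10.0 or i_rmsd <= 4.0):
        return "acceptable"
    return "incorrect"

# Values reported on the slides for the human-selected models.
t79_class = capri_class(0.16, 14.1, 3.8)
t91_class = capri_class(0.33, 9.0, 4.2)
```

Under these thresholds both reported models fall into the acceptable class: T79 through its I-RMSD and T91 through its L-RMSD.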

T79 Interface Prediction
Method       Precision  Recall  F-score
BindML       0          0       NA
Cons-PPISP   0.10       0.18    0.12
17
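Precision, recall, and F-score for an interface prediction can be computed from the sets of predicted and true interface residues. The sketch below uses the standard definitions; the function name and the residue numbers are made up for illustration. The F-score is returned as `None` when precision and recall are both zero, which corresponds to the "NA" entries in the table.

```python
def interface_metrics(predicted, true):
    """Return (precision, recall, f_score) for two sets of residue IDs.

    f_score is None when it is undefined (no true positives).
    """
    predicted, true = set(predicted), set(true)
    tp = len(predicted & true)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(true) if true else 0.0
    if precision + recall == 0:
        return precision, recall, None
    return precision, recall, 2 * precision * recall / (precision + recall)

# Hypothetical residue numbers: 2 of 4 predictions are correct, 6 true residues.
p, r, f = interface_metrics({1, 2, 3, 4}, {3, 4, 5, 6, 7, 8})
# A prediction with no overlap gives an undefined F-score.
p0, r0, f0 = interface_metrics({1}, {2})
```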

T79 Scores (no-interface prediction). [Figure panels: GOAP, DFIRE, ITScore; axes: I-RMSD, L-RMSD, fnat.] 18

T79 Score Comparison. [Figure axes: GOAP, DFIRE, ITScore.] 19

T79 PRESCO Scores. [Figure panels: With Interface Prediction, Without Interface Prediction; axes: L-RMSD, PRESCO.] 20

T79 Score Performance Summary
Run              Score    RFH      Hits in top 10
nointerface      ITScore  1 (62)   3
nointerface      GOAP     1 (72)   3
nointerface      DFIRE    1 (111)  5
BindMLconsPPISP  all      -        -
RFH: rank of the first acceptable hit, with the rank of the first medium hit in parentheses
21
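The two summary columns, RFH and hits in the top 10, can be derived from a score-sorted list of decoy quality labels. A minimal sketch follows; the function names and the toy label list are hypothetical.

```python
def rank_of_first_hit(ranked_labels, accepted):
    """1-based rank of the first decoy whose label is in `accepted`, else None."""
    for rank, label in enumerate(ranked_labels, start=1):
        if label in accepted:
            return rank
    return None

def hits_in_top(ranked_labels, accepted, k=10):
    """Number of decoys among the top k whose label is in `accepted`."""
    return sum(1 for label in ranked_labels[:k] if label in accepted)

# Toy ranking, best score first.
labels = ["acceptable", "incorrect", "medium", "incorrect"]
any_hit = {"acceptable", "medium", "high"}
medium_or_better = {"medium", "high"}

rfh_acceptable = rank_of_first_hit(labels, any_hit)
rfh_medium = rank_of_first_hit(labels, medium_or_better)
top2_hits = hits_in_top(labels, any_hit, k=2)
```

For this toy ranking the table entry would read "1 (3)": the first acceptable-or-better hit is at rank 1 and the first medium-or-better hit is at rank 3.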

T91 (Round 30)
Kihara: 8 hits; LZerD: 2 hits
Homodimer
LZerD runs:
  No-interface prediction (with our monomer model)
  With BindML+consPPISP interface prediction
  Zhang1 CASP server model, no-interface prediction
Server selection strategy: 10 from no-interface
Human selection strategy: consensus of ITScore, GOAP, PRESCO, and visual inspection; 5 from no-interface, 5 from Zhang1
22

T91 Subunit Models
Chain  Our model RMSD  Zhang RMSD
C      6.0 Å           4.9 Å
D      6.5 Å           5.7 Å
[Figure legend: native, our model, Zhang1.]
23

T91 Human Selected Model
fnat 0.33, L-RMSD 9.0 Å, I-RMSD 4.2 Å
[Figure legend: model, native.]
24

T91 Interface Prediction
Method       Precision  Recall  F-score
BindML       0.64       0.20    0.30
Cons-PPISP   0.50       0.28    0.36
25

T91 Scores (no-interface prediction). [Figure panels: GOAP, DFIRE, ITScore; axes: I-RMSD, L-RMSD, fnat.] 26

T91 Scores (with interface prediction). [Figure panels: GOAP, DFIRE, ITScore; axes: I-RMSD, L-RMSD, fnat.] 27

T91 Scores (Zhang models). [Figure panels: GOAP, DFIRE, ITScore; axes: L-RMSD, fnat, I-RMSD.] 28

T91 Zhang1 Score Comparison. [Figure axes: GOAP, DFIRE, ITScore.] 29

T91 PRESCO Scores. [Figure panels: Docking with Zhang models, Without Interface Prediction; axes: L-RMSD, PRESCO.] Top 5 models selected from each. 30

T91 Score Performance Summary
Run          Score    RFH     Hits in top 10
nointerface  ITScore  2       2
nointerface  GOAP     2       1
nointerface  DFIRE    1       2
interface    ITScore  1042    0
interface    GOAP     165     0
interface    DFIRE    116     0
zhang1       ITScore  1 (4)   5
zhang1       GOAP     2 (16)  5
zhang1       DFIRE    1 (6)   6
RFH: rank of the first acceptable hit, with the rank of the first medium hit in parentheses
31

T96 (Round 31)
Heterodimer
Predictor hits: 0 (5 by other groups)
Scorer hits: human 1, server 0 (1 by other group)
Human: 6 selected by PRESCO; 4 selected using the predicted interface, ITScore, GOAP, and DFIRE
No PDB file for the native structure was available: metrics were computed using the two scorer hits (average L-RMSD/I-RMSD, max fnat)
32

T96 Scorer Hits
Model                 fnat  L-RMSD  I-RMSD
S31.M06 (Kihara)      0.32  7.99 Å  2.67 Å
S39.M03 (Haliloglu)   0.22  5.68 Å  2.44 Å
[Figure labels: Chain A, Chain B.]
33

T96 Interface Prediction
Chain  Method       Precision  Recall  F-score
A      BindML       0.15       0.2     0.17
A      Cons-PPISP   0          0       NA
B      BindML       0.12       0.11    0.12
B      Cons-PPISP*  NA         NA      NA
*Cons-PPISP predictions were only for the N-terminal tail; visual inspection suggests that the N-terminal tail is not likely a binding site, so these predictions were not used.
34

T96 Scorer-Model Scores. [Figure panels: GOAP, DFIRE, ITScore; axes: I-RMSD, L-RMSD, fnat.] 35

T96 Score Performance Summary
Score    RFH  Hits in top 10
ITScore  529  0
GOAP     6    1
DFIRE    125  0
RFH: rank of first acceptable hit
The hit for GOAP/DFIRE is the same model picked by PRESCO
36

Summary
Our docking prediction procedure runs LZerD, and decoys are selected by combining DFIRE, ITScore, GOAP, and PRESCO.
Binding sites were predicted by BindML and cons-PPISP.
On the examples shown, PRESCO's performance was not as spectacular as we expected from its performance on single-chain structure prediction.
DFIRE, ITScore, and GOAP showed similar, reasonably good performance.
Scoring function performance depends on subunit model quality.
The way BindML predictions are used needs to be improved.
37

Lab Members: Hyung-Rae Kim, Lenna Peterson. @kiharalab 38