NMR-Structure determination with the program CNS

Block course (Blockkurs) 2013, Exercise 11.10.2013, room Mango?

NMR-Structure determination - Overview

Amino acid sequence

Topology file nef_seq.mtf:

    loop_
    _cns_mtf_atom.id
    _cns_mtf_atom.segment_id
    _cns_mtf_atom.residue_id
    _cns_mtf_atom.residue_name
    _cns_mtf_atom.atom_name
    _cns_mtf_atom.chemical_type
    _cns_mtf_atom.charge
    _cns_mtf_atom.atom_mass
    1 ' 19 ALA CA CH1E 0.220000 12.0110
    2 19 ALA HA HA 0.100000 1.00800
    3 19 ALA CB CH3E -0.300000 12.0110
    4 19 ALA HB1 HA 0.100000 1.00800

Extended structure nef_ext.pdb

NMR restraints (noe.tbl, hbonds.tbl); simulated annealing calculation using protocol anneal.inp:

    assign (resid 19 and name HA ) (resid 20 and name HN ) 2.1 0.3 1.0
    assign (resid 19 and name HB# ) (resid 20 and name HN ) 2.3 0.5 1.5
    assign (resid 19 and name HA ) (resid 22 and name HB# ) 3.0 1.2 1.0
    assign (resid 20 and name HA ) (resid 21 and name HA ) 4.4 2.6 1.0

Calculated structure nef_calculated.pdb: good/bad structure? Angles, distances, energies etc. okay?
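The assign lines in the overview can be read as a target distance d followed by the allowed deviations below (dminus) and above (dplus), so the restrained range is d - dminus to d + dplus. The following is a minimal Python sketch of that reading, not part of the course material. The names NoeRestraint and parse_noe are hypothetical, and the pattern only covers the simple "resid N and name X" selections shown above.

```python
import re
from typing import NamedTuple


class NoeRestraint(NamedTuple):
    """One distance restraint; field names are illustrative only."""
    resid_a: int
    atom_a: str
    resid_b: int
    atom_b: str
    d: float        # target distance in Angstrom
    d_minus: float  # allowed deviation below d
    d_plus: float   # allowed deviation above d

    @property
    def lower(self) -> float:
        return self.d - self.d_minus

    @property
    def upper(self) -> float:
        return self.d + self.d_plus


# Assumption: only simple "(resid N and name X)" selections are handled.
_ASSIGN = re.compile(
    r"assign\s*"
    r"\(\s*resid\s+(\d+)\s+and\s+name\s+(\S+)\s*\)\s*"
    r"\(\s*resid\s+(\d+)\s+and\s+name\s+(\S+)\s*\)\s*"
    r"([\d.]+)\s+([\d.]+)\s+([\d.]+)",
    re.IGNORECASE,
)


def parse_noe(text):
    """Return a list of NoeRestraint objects found in the given table text."""
    restraints = []
    for match in _ASSIGN.finditer(text):
        ra, aa, rb, ab, d, dm, dp = match.groups()
        restraints.append(
            NoeRestraint(int(ra), aa, int(rb), ab, float(d), float(dm), float(dp))
        )
    return restraints


if __name__ == "__main__":
    example = "assign (resid 19 and name HA ) (resid 20 and name HN ) 2.1 0.3 1.0"
    for r in parse_noe(example):
        print(r.resid_a, r.atom_a, r.resid_b, r.atom_b, round(r.lower, 2), round(r.upper, 2))
```

Under this reading, all four example restraints share a lower bound of about 1.8 Å and differ only in the upper bound.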

Goals of the course:
A. Calculate a structure of the protein HIV-NEF by simulated annealing using the program CNS.
B. Evaluate the calculated structures.
C. Visualize the result.

First steps in the LINUX world
1. Log in with your unibas e-mail account name and password.
2. To see which directory you are in: pwd
3. To see the content of the directory: ls -al
4. To change between directories:
   one down: cd directory-name (e.g. cd input_data)
   one up: cd ..
5. To edit and look at the content of a file:
   if you are in the same directory: e file-name (e.g. e noe.tbl)
   else: e input_data/noe.tbl

A.) Structure Calculation
0. Files: find the molecular topology file nef_seq.mtf, the extended structure nef_ext.pdb and the simulated annealing protocol anneal.inp in the directory data. Find the experimental NMR data in the directory input_data.
1. Start the structure calculation with the command

    cns < anneal.inp > 3d_structure.log

   This command runs the simulated annealing protocol anneal.inp in the program CNS; the status of the structure calculation is saved in the text file 3d_structure.log. CNS takes some 10 minutes per structure calculation (the time-limiting step here).
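While the calculation runs, it may help to picture what an annealing protocol does: the temperature is lowered step by step while the weight of the experimental restraints is raised. The Python sketch below only illustrates that idea. The function name annealing_schedule, the linear cooling, the geometric ramp and all numbers are assumptions for illustration and are not taken from anneal.inp.

```python
def annealing_schedule(t_hot, t_final, k_start, k_final, n_steps):
    """Return a list of (temperature, restraint_force_constant) pairs.

    Assumed toy scheme: the temperature falls linearly from t_hot to
    t_final, while the restraint force constant rises geometrically
    from k_start to k_final.
    """
    if n_steps < 2:
        raise ValueError("need at least two steps")
    ratio = (k_final / k_start) ** (1.0 / (n_steps - 1))
    schedule = []
    for i in range(n_steps):
        frac = i / (n_steps - 1)
        temperature = t_hot + frac * (t_final - t_hot)
        force_constant = k_start * ratio ** i
        schedule.append((temperature, force_constant))
    return schedule


if __name__ == "__main__":
    # Hypothetical values, chosen only to make the trend visible.
    for temperature, k in annealing_schedule(2000.0, 100.0, 1.0, 100.0, 5):
        print(f"T = {temperature:7.1f}   k_restraint = {k:7.2f}")
```

When reading anneal.inp, look for the corresponding statements that set the temperature and the force constants at each stage, and compare them with this simplified picture.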

2. Open a new terminal and look at the input file anneal.inp and the files nef_seq.mtf and nef_ext.pdb with a text editor.

Questions:

I. Input file
1.) Find the molecular topology file and the extended starting structure as inputs for the structure calculation.
2.) Understand how the temperature and force constants are modified in the annealing steps (high kinetic and low van der Waals energies at the beginning help to avoid odd structures that are trapped energetically).
3.) Which experimental information is used in the structure calculation? (Remark: files with experimental data carry the extension .tbl. Look at the respective files in the directory input_data.)
4.) What is the output name for the calculated coordinates?

Important parameters for the constitution of the molecule are: the sequence of the protein and the numbering of the amino acid residues, the attachment of protons (NMR/X-ray!), the building of disulfide bridges (if any), the attachment of prosthetic groups (if any) and the output file.

II. Molecular topology file
It has been generated by a separate script from the sequence and the CNS parameters only, and contains the following information: atom names, types, charges and masses; residue names; and the lists of all bonds, angles, dihedral angles and improper angles between the specified atoms in the covalent structure.
5.) Which is the first residue of NEF used in the structure determination?
6.) How many atoms define an angle, a dihedral angle and an improper angle?

III. Extended PDB structure
Besides the names and numbers of the atoms and residues, it contains the Cartesian coordinates (x, y, z) of every atom in Å. The last two columns are not used in NMR structure determination.
7.) Which residues are deleted in the present structure of NEF?
8.) What is the largest extension of the protein; how long is it?

IV. Calculated PDB files
1.) Check if your first structure(s) has/have been calculated.
2.) Estimate the new distance between the N- and C-terminus of one of your structures.
3.) At which 'temperature' was your calculation performed?

The '.pdb' files contain the coordinates for the different structures calculated, together with information about the calculation. The 'energy' is a measure of the overall deviation (overall energy) of the structure from one that would agree with all the measured restraints (distances and dihedral angles) and with all stereochemical and covalent constraints. A low value indicates good agreement. To find out which of your structures is the best you can use the unix command

    grep overall *.pdb | sort -nr -k4

or

    grep 'noe ' *.pdb | sort -nr -k4

to get a sorted list of all the values on the screen.

4.) Which is the energetically most favorable structure?
5.) Give the energy ranges found for the various parameters, stereochemical and data restraints.

B.) The Evaluation
To obtain a more general quality criterion for the calculated structures, we will assess how unusual the geometry of the residues in a given protein structure is, compared with stereochemical parameters derived from well-refined, high-resolution structures and from classic model building of the early days. You will run a stereochemical analysis both on a single structure (using procheck) and on an ensemble of structures (using procheck_nmr).

1. To facilitate data evaluation you should use another directory for these calculations; change to the one called check within the data directory.
2. To run procheck, type the following:

    cd check
    procheck ../filename.pdb 2.5

   You have to specify the coordinate file together with the proper directory information and the expected resolution (in Ångström) of the structure, e.g. 2.5.
3. To run procheck_nmr you only need a coordinate file. However, this .pdb file has to be a multi-structure file, i.e. it contains all the separately calculated structures of the set. This file is created by:

    joinpdb -o output.pdb

   You will be prompted to specify the root of the input files (the root of ../file_1.pdb, ../file_2.pdb is ../file_) and the total number of files with this root (to see them type: ls root*). Then run:

    procheck_nmr output.pdb
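The energy ranking from part A (grep overall *.pdb | sort -nr -k4) can also be sketched in Python. This sketch assumes that each calculated .pdb file contains a header line of the form "REMARK overall = <value>". That format is inferred from the grep/sort column numbers and should be checked against your own files. The function names are hypothetical.

```python
import re

# Assumption: the header line looks like "REMARK overall = 123.4".
_OVERALL = re.compile(
    r"^REMARK\s+overall\s*=\s*([-+]?\d+(?:\.\d*)?(?:[eE][-+]?\d+)?)",
    re.MULTILINE,
)


def overall_energy(pdb_text):
    """Return the overall energy found in the PDB header, or None."""
    match = _OVERALL.search(pdb_text)
    return float(match.group(1)) if match else None


def rank_structures(pdb_texts):
    """Sort {name: file_content} by overall energy, lowest (best) first."""
    scored = []
    for name, text in pdb_texts.items():
        energy = overall_energy(text)
        if energy is not None:
            scored.append((name, energy))
    return sorted(scored, key=lambda item: item[1])


if __name__ == "__main__":
    from pathlib import Path

    # Reads every .pdb file in the current directory.
    texts = {p.name: p.read_text() for p in Path(".").glob("*.pdb")}
    for name, energy in rank_structures(texts):
        print(f"{energy:12.3f}  {name}")
```

Unlike sort -nr, which prints the highest value first, this sketch lists the lowest energy first.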

4. Procheck and procheck_nmr generate a number of output files, which have the same names as the original coordinate file but different extensions. The relevant results are in postscript format (filename_01.ps, filename_02.ps), which can be displayed with the tool gs (ghostscript) or sent to the printer:

    gs filename.ps
    lp filename.ps

   A description of these files is given in Appendix B.

Questions procheck & procheck_nmr:
1.) How can the quality of a structure be deduced from the Ramachandran plot?
2.) To which resolution of a crystal structure does the best structure you calculated correspond?
3.) Which secondary structure elements are present in NEF?
4.) Are the number and size of secondary structure elements related to the quality of the structure?

C.) The Visualization
The standard measure to quantify the difference between two 3D structures is the rmsd (root mean square deviation) for a given set of corresponding atoms, e.g. the C-alpha atoms of the structures. Visualization and alignment can be done with the pdb-viewer spdbv. The pdb-viewer program is started by typing

    spdbv

Further information is available in the appendix (attached to this script) or as a web page (open the web browser installed here and look for spdbv). To detect the parts of the protein which are especially flexible, it is illuminating to look at all or several structures of an ensemble calculated with the same parameters.

Questions:
1.) Which parts of the structure show the largest flexibility? Specify the ranges.
2.) How large (in terms of rmsd) are the differences between the structures?
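The rmsd defined in part C is the square root of the mean squared distance between corresponding atoms. The Python sketch below computes only this number and assumes the two coordinate sets are already superimposed; the fit performed by a viewer such as spdbv is not reproduced. The function name rmsd is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z).

    Assumption: the structures are already superimposed; no fitting is done.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Two hypothetical C-alpha traces of two atoms each (coordinates in Angstrom).
    model_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model_2 = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
    print(rmsd(model_1, model_2))
```

Without a prior superposition the value also contains the overall rotation and translation between the models, so it is an upper bound on the fitted rmsd.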

Appendix

A. Experimental Data

The distance and angular restraints deduced from the data of different NMR experiments are the basis of the structure calculation. The key NMR data carrying structural information are the following:

- Distance information in the range of 2-5 Å between protons, measured as NOE intensities (noe.tbl). The cross-peak intensities between the protons are directly related to the distance between the atoms (dipole-dipole interaction, proportional to 1/r^6).
- Distance information between protons and oxygens (H-bonds < 2.6 Å), indicated by slow hydrogen-deuterium exchange (hbonds.tbl). H-bonds can now also be determined by direct measurement of scalar J-couplings across N-H···O=C.
- Angle information about the peptide φ angles, as measured by the three-bond J-coupling between Hα and HN (phij.tbl). The size of the coupling is related to the value of the angle between CαHα and NHN through the Karplus relation.
- Angle information for the peptide φ, ψ angles, deduced from the carbon chemical shifts of Cα and Cβ (cacb.tbl), which depend on the backbone conformation.
- Angle restraints for all dihedral angles of the amino acid residues, obtained from other measurements and from specifications of angular ranges deduced from known structural parameters of high-resolution structures (dihed.tbl).

The structure calculation is performed using this information either directly, by transforming the distance information between atoms to 3D coordinates ("distance geometry"), or by restrained molecular dynamics calculations. The drawback of the direct approach is that it only results in an unambiguous structure if all distances are known. Therefore, a refinement including the distance and angular information is always necessary.

Molecular dynamics calculations result in a 3D structure by starting with extremely large random changes in the molecule geometry (high temperature phase). The final conformation is reached by reducing the flexibility of the molecule stepwise (annealing phase) and at the same time increasing the influence of the experimentally determined distance and angle information (the force constant of the restraints is increased). This simulated annealing resembles the natural folding process, artificially accelerated by variable force constants and temperatures, since the natural folding process itself is hardly accessible by computational methods to date. In this way the chance of obtaining a structure in best agreement with the input data (i.e. with the lowest energy) is much higher than when simply trying to minimize the deviations of a structure from the distance and angular restraints in a nonlinear fitting procedure.

To illustrate the method of simulated annealing, the structure of the protein NEF (19 kDa; after deletion of a flexible loop and the N-terminus, 137 amino acids) will be calculated during this course.
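Two of the relations mentioned in this appendix can be written down explicitly. The Python sketch below shows (i) the 1/r^6 dependence of the NOE intensity, used to estimate a distance from a reference pair, and (ii) a Karplus-type curve for the three-bond coupling between HN and Hα as a function of φ. The handout gives no coefficients; the values used here (A = 6.51, B = -1.76, C = 1.60 Hz) are one widely used literature parametrization and are an assumption, as are the function names. The NOE formula assumes an isolated spin pair.

```python
import math


def noe_distance(intensity, ref_intensity, ref_distance):
    """Estimate a proton-proton distance from NOE intensities.

    Uses intensity ~ 1/r^6, i.e. r = r_ref * (I_ref / I)**(1/6).
    Assumption: isolated spin pair, same conditions for both cross peaks.
    """
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


def karplus_3j_hn_ha(phi_deg, a=6.51, b=-1.76, c=1.60):
    """Three-bond HN-HA coupling (Hz) for a backbone angle phi in degrees.

    Karplus form: J = A cos^2(theta) + B cos(theta) + C with theta = phi - 60.
    The default coefficients are an assumed literature parametrization.
    """
    theta = math.radians(phi_deg - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


if __name__ == "__main__":
    # A cross peak 64 times weaker than a 2.0 A reference pair.
    print(round(noe_distance(1.0, 64.0, 2.0), 2))
    # Typical helix-like and strand-like phi values.
    for phi in (-60.0, -120.0):
        print(phi, round(karplus_3j_hn_ha(phi), 2))
```

With these coefficients a helix-like φ gives a small coupling and a strand-like φ a large one, which is why the measured coupling restricts φ.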

B. The procheck and procheck_nmr output files

The Ramachandran plot (.._1.ps / 'nmr'.._1.ps [output of procheck / output of procheck_nmr]) shows the φ, ψ torsion angles for all residues in the structure or the ensemble of structures. Glycines are separately identified by triangles, as they lack a sidechain and are thus not restricted to the regions of the plot appropriate to the other residues. The red area corresponds to the "core" regions representing the most favorable combinations of φ, ψ values. The percentage of residues in the "core" regions is one of the better guides to structural quality.

The equivalent resolution (.._2.ps / 'nmr'.._3.ps), evaluated in six/five graphs of the parameters, shows how the structure (represented by the solid square) compares with well-refined structures at a similar resolution.

The properties plotted by procheck are:
a) Ramachandran plot quality. This property is measured by the percentage of residues in the most favored regions of the Ramachandran plot. As the structural quality gets poorer, this value decreases.
b) Peptide bond planarity.
c) Bad non-bonded interactions. This property is measured by the number of bad contacts per 100 residues. Bad contacts are defined as contacts where the distance of closest approach is less than or equal to 2.6 Å.
d) Cα tetrahedral distortion. This property is measured by calculating the standard deviation of the zeta torsion angle, which is defined by the following four atoms within a given residue: Cα, N, C and Cβ.
e) Main-chain hydrogen bond energy. This property is the standard deviation of the hydrogen bond energies for main-chain hydrogen bonds.
f) Overall G-factor. The overall G-factor is a measure of the overall normality of the structure, compared to known high-resolution structures.

The properties plotted by procheck_nmr are:
a) Ramachandran plot quality assessment.
b) Main-chain hydrogen bond energies.
c) χ1 pooled standard deviations, giving the standard deviations of the χ1 values from their nearest favoured conformation (i.e. gauche minus, trans and gauche plus).
d) Standard deviation of χ2 trans angle, giving the standard deviation of only the trans χ2 dihedral angles.
e) Overall G-factor.

The residue properties / ensemble properties (.._3.ps / 'nmr'.._2.ps) show how the protein's geometrical properties vary along its sequence. This visualizes which regions have consistently poor or unusual geometry (maybe because they are poorly defined) and which have a rather normal geometry.

The properties plotted by procheck are:

a) Absolute deviation from mean "ideal" χ1 value (excl. prolines).
b) Absolute deviation from mean "ideal" of ω torsion.
c) Cα chirality.
   For each graph, unusual values (usually those more than 2 standard deviations away from the "ideal" mean value) are highlighted.
d) Secondary structure & average estimated accessibility. The shading behind the schematic secondary structure assignments gives an approximation to the residue accessibilities, i.e. which parts of the structure are buried and which are exposed on the surface.
e) Sequence & Ramachandran regions. The graph shows the sequence of the structure (using the one-letter amino-acid codes) and a set of markers that identify the region of the Ramachandran plot in which each residue is located.
f) Max. deviation. The histogram of asterisks and plus signs shows each residue's maximum deviation from one of the ideal values given on the residue-by-residue listing in the .out file (the respective parameter is given in the final column).
g) G-factors. The shaded squares give a schematic representation of each residue's G-factor values for its φ-ψ, χ1-χ2 and χ1 dihedral angles. (Note that the χ1 G-factors are shown only for those residues that do not have a χ2.) Regions with many dark squares correspond to regions where the properties are "unusual", as defined by a low (or negative) G-factor. These may correspond to highly mobile or poorly defined regions such as loops, or may need further investigation.

The properties plotted by procheck_nmr are:
a) Absolute deviation of the residue's χ1 torsion angle from the nearest "ideal" value.
b) Absolute deviation of the omega torsion angle from the "ideal".
   For each procheck_nmr graph, the values for each residue in each model are plotted as individual crosses. The mean and standard deviation values for a given residue are indicated by the circle and bars, respectively. The model whose value is the highest is marked by its printed model number. The dashed line corresponds to 2.0 standard deviations from the ideal, so an excess of points above this line suggests possible problems with the geometry.
c) RMS deviation from mean coordinates.
d) The histogram shows the rms deviations of the main-chain atoms (black bars) and the sidechain atoms (grey bars). The mean coordinates are calculated simply by averaging each atom's coordinates across the whole ensemble.
e) Secondary structure & average estimated accessibility.
f) Sequence & average estimated accessibility. The darker the symbol, the higher the accessibility.
g) Circular variances. The dials give a schematic representation of each residue's circular variance values for its φ, ψ, χ1 and χ2 angles, and for its φ-ψ and χ1-χ2 combinations. The larger the black segment on the dial, the higher the circular variance, and hence the wider the spread of the corresponding dihedral angle distribution. These regions of the structure may correspond to highly mobile or poorly defined regions, such as loops, or may need further investigation.
h) G-factors.

The model-by-model plot (- / 'nmr'.._4.ps) shows the secondary structure assignment for each of the models in the ensemble. This can be useful to see which models possess the expected secondary structure elements, and which may have lost the requisite hydrogen-bonding patterns.
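The circular variance shown on the dials can be illustrated with the usual definition from circular statistics: 1 minus the length of the mean unit vector of the angles. The Python sketch below assumes this standard definition. The exact formula used by procheck_nmr is not given in this handout, and the function name is illustrative.

```python
import math


def circular_variance(angles_deg):
    """Circular variance of a list of angles in degrees.

    Returns 0 when all angles coincide and approaches 1 when they are
    spread evenly around the circle.
    Assumption: standard definition, 1 - |mean unit vector|.
    """
    n = len(angles_deg)
    if n == 0:
        raise ValueError("need at least one angle")
    sum_cos = sum(math.cos(math.radians(a)) for a in angles_deg)
    sum_sin = sum(math.sin(math.radians(a)) for a in angles_deg)
    return 1.0 - math.hypot(sum_cos, sum_sin) / n


if __name__ == "__main__":
    # Hypothetical phi values of one residue across five models.
    well_defined = [-62.0, -58.0, -65.0, -60.0, -59.0]
    poorly_defined = [-60.0, 60.0, 180.0, -120.0, 90.0]
    print(round(circular_variance(well_defined), 3))
    print(round(circular_variance(poorly_defined), 3))
```

Using unit vectors instead of a plain average avoids the wrap-around problem at ±180°, where angles such as 179° and -179° are in fact close together.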

C. Basic description of the pdb-viewer: spdbv (PDB-viewer)

File:     open PDB-file / select - open
          close
          quit
Display:  backbone oxygen / toggle on/off
          protons / toggle on/off
Window:   control panel / shift-left_mouse in third column: side-chains off / shift-left_mouse in last column: color menu for the chain / at the very top: file selection with left_mouse
          ramachandran plot / select the residues by name with left_mouse / close window by clicking on the X in the right corner of the window frame
Fit:      Best, to get the rms deviation between the structures

First the menu window opens with symbols and pulldown menus. Of the pulldown menus only FILE, FIT, DISPLAY and WIND will be used.

To open a '.pdb' file click FILE and select the submenu 'open PDB-file'. Once you see the proper directory (reached either by going up one directory level, clicking '..', or by going down, clicking on the desired directory name), click on the filename. After a moment two new windows will open. One (with text) is simply a log file and can be closed immediately by clicking on the X in the right corner of the window frame. The second window should display the molecule. To display more structures at the same time repeat this step, but first open only one and become familiar with the following model manipulations:

The molecule can be rotated by pressing the left_mouse_button and moving the mouse, translated by pressing both_mouse_buttons, and zoomed (variation in size) by pressing the right_mouse_button.

The appearance of the molecule can be changed in two ways:
1) In the menu DISPLAY, by removing the protons and the carbonyl O of the backbone by clicking on the respective submenus.
2) With the 'Control Panel', a new window to be opened in the menu WIND:
3) In the 'Control Panel' the sidechains can be removed (and later re-introduced) by clicking on the third column with shift pressed and left_mouse.
4) The backbone can be removed by clicking on the second column with shift pressed and left_mouse.
5) Amino acid residues can be selected by clicking on the first column with shift pressed and left_mouse.
6) A sketch with secondary structure elements can be generated by clicking on the column before the last with shift pressed and left_mouse.
7) The colour of the model can be changed by clicking on the last column with shift pressed and left_mouse and then adjusting the colour in the window that appears (drag the upper slider in the second triplet to the right and the ones below it to the left to get 'red').
8) In addition, every residue can be manipulated individually by clicking at the height of the residue on the respective column.
9) The top part of the 'Control Panel' allows one to select between the different models loaded, by clicking on it (in the image the file is named 1GDI) and sliding with left_mouse pressed to the desired filename. Alternatively, open Layer infos in the menu WIND and select/deselect the different models there. With Layer infos one can also switch atoms on/off (protons, backbone oxygens etc.).
10) In the row below the model-selection part one can make the selected model visible or invisible by clicking on the left part of this row.

A label can be attached to a specific residue of the displayed model (to get the type and residue number) by first selecting the eighth symbol from the left (LEU41) in the menu window and then clicking on the desired part of the model.
The full mouse control of the model movements can be obtained by clicking on the fourth symbol from the left.

For the alignment of two structures use the menu FIT and select 'Best (with Struc, Align)'. This procedure additionally displays the rmsd (the measure of the deviation) between the two specified structures in the last row of the menu window.

A Ramachandran plot can be created by selecting this topic in the menu WIND. Either only a few or all residues of the active model can be selected in the 'Control Panel' to be displayed in the plot, by sliding across the first column with left_mouse pressed (from top to bottom if all should be selected). This window can be closed by clicking on the X in the right corner of the window frame.

Finally, the structures can be removed in the menu FILE by selecting 'close' for one after the other, which closes the windows and removes the structures loaded. In the same way the program can be closed completely by selecting 'quit'.

D. Summary of commands

switch to LINUX          ctrl ctrl b return
login                    user: unilogon, password: unilogon

Basic UNIX:
pwd                      gives the current directory name
cd directory-name        change directory
cd ..                    - up
cd dir_name              - down
ls -al                   see the content of the directory
grep text filename       select all lines which contain the expression 'text' in the file filename
more filename            scroll through a file
tail filename            look at the end of a file
e filename               start a text editor
nedit filename           start a text editor
pulldown menu FILE:      to open, save or quit
sort -n -k4 filename     sorts according to the value of a specified column
                         -n numeric, -r reverse, -k4 4th column
sort -nr -k4 filename    sorts filename in reverse order of the value of column 4
gs filename.ps           to display postscript files
after 'gs' has been started:
                         hit return for next page, type quit to exit
lpr filename.ps          to print postscript files
lp filename.ps           to print postscript files on sgi computer

cns < filename.inp > filename.log
    for example: cns < gen_seq2.inp > gen_seq.log

procheck filename.pdb resolution
    example: procheck nef_ext.pdb 2.5

joinpdb -o output_file.pdb
    (the program then asks for the root name of the input files [like 'nef_' for nef_1.pdb, nef_2.pdb, ...] and the total number of input files)

procheck_nmr filename.pdb [select.dat]
    (select.dat is optional and contains the numbers of those files which should be used)
    example: procheck_nmr nef_tot.pdb