

NMRQ: A Web Server for the Validation, Comparison and Analysis of Protein Structures Solved by NMR

Gary Van Domselaar†, Paul Stothard, Trent Bjorndahl, Steve Neal, Mark Berjanskii, and David S. Wishart‡

Department of Computing Science and Biological Sciences, University of Alberta

Edmonton AB T6E 2E9

†[email protected] ‡[email protected]

Chemical Shift Report

NMRQ uses the model coordinates in the ensemble to predict the backbone chemical shifts, and compares the predictions to the supplied experimental chemical shift data. (a) NMRQ presents the predicted-vs-observed data as a scatterplot for each model. Cumulative and average chemical shift scatterplots are reported for the ensemble. Outliers indicate possible misassignments. NMRQ compares the correlation coefficient to a database of high-quality NMR structures to provide the chemical shift 'R-factor': a measure of the quality of the fit between model and experimental data. (b) A plot comparing Chemical Shift Index with secondary structure.
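The predicted-vs-observed comparison can be sketched as follows. This is a minimal illustration rather than NMRQ's own code: the function names, the plain Pearson coefficient, and the 2-standard-deviation outlier cutoff are assumptions made for the example. The mapping from correlation coefficient to the chemical shift 'R-factor' is not reproduced, because the poster does not specify it.

```python
import math


def pearson_r(predicted, observed):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(observed) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_o)


def shift_outliers(predicted, observed, n_sigma=2.0):
    """Indices whose (observed - predicted) difference lies more than
    n_sigma standard deviations from the mean difference.

    The cutoff is an assumed value for illustration only.
    """
    diffs = [o - p for p, o in zip(predicted, observed)]
    mean_d = sum(diffs) / len(diffs)
    sd = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / len(diffs))
    if sd == 0:
        return []
    return [i for i, d in enumerate(diffs) if abs(d - mean_d) > n_sigma * sd]
```

Flagged indices would be candidates for a closer look at the assignment, not proof of a misassignment.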

NOE Report (Under Development)

NMRQ uses Aqua [4] to report NOE violations, NOEs/residue, and NOE completeness. NMRQ also identifies the long-range NOEs in the well-ordered regions of the ensemble. These NOEs are considered to be the most important for folding the protein structure. The 'expected important NOEs' are compared to the observed NOEs, and a quality score is derived from the degree of their agreement.
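A rough sketch of the expected-vs-observed comparison is given below. It assumes NOEs are reduced to residue-number pairs, that "long-range" means a sequence separation of at least 5 residues (a common convention, but an assumption here), and that the score is simply the fraction of expected pairs that were observed. The poster does not state how NMRQ derives the expected NOEs or the score, so all names and thresholds are hypothetical.

```python
def long_range_pairs(noe_pairs, ordered_residues, min_separation=5):
    """Keep residue pairs that are far apart in sequence and both well ordered.

    min_separation is an assumed threshold, not a value taken from NMRQ.
    """
    ordered = set(ordered_residues)
    selected = set()
    for i, j in noe_pairs:
        if abs(i - j) >= min_separation and i in ordered and j in ordered:
            # Store each pair in sorted order so (i, j) and (j, i) coincide.
            selected.add((min(i, j), max(i, j)))
    return selected


def noe_agreement(expected_pairs, observed_pairs):
    """Fraction of expected pairs that are also observed (0.0 if none expected)."""
    expected = {(min(i, j), max(i, j)) for i, j in expected_pairs}
    observed = {(min(i, j), max(i, j)) for i, j in observed_pairs}
    if not expected:
        return 0.0
    return len(expected & observed) / len(expected)
```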

References

1. Lovell SC, Davis IW, Arendall WB 3rd, de Bakker PI, Word JM, Prisant MG, Richardson JS, and Richardson DC (2003) Proteins 50:437-50.

2. Nederveen AJ, Doreleijers JF, Vranken W, Miller Z, Spronk CA, Nabuurs SB, Guntert P, Livny M, Markley JL, Nilges M, Ulrich EL, Kaptein R, and Bonvin AM (2005) Proteins 59:662-72.

3. Morris AL, MacArthur MW, Hutchinson EG, and Thornton JM (1992) Proteins 12:345-364.

4. Laskowski RA, Rullmann JAC, MacArthur MW, Kaptein R, and Thornton JM (1996) J. Biomol. NMR 8:477-486.

Volume/Area Report

(a) NMRQ uses Voronoi polyhedra to calculate residue volumes. Unusual volumes suggest possible packing defects. (b) Fractional accessible surface areas reveal energetically unfavorable regions.
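The two quantities above can be illustrated with a short sketch. It does not perform the Voronoi calculation itself. It assumes per-residue volumes and accessible surface areas have already been computed, and that the caller supplies reference values (mean and standard deviation of the volume per residue type, and a reference fully exposed surface area). The function names and the z-score cutoff of 3 are hypothetical choices, not details from NMRQ.

```python
def fractional_asa(observed_asa, reference_asa):
    """Observed accessible surface area divided by a reference exposed value."""
    if reference_asa <= 0:
        raise ValueError("reference_asa must be positive")
    return observed_asa / reference_asa


def unusual_volumes(volumes, reference, z_cutoff=3.0):
    """Flag residues whose volume is far from the reference mean for their type.

    volumes:   list of (residue_id, residue_type, volume)
    reference: dict mapping residue_type -> (mean_volume, std_dev),
               supplied by the caller (no reference data is bundled here)
    z_cutoff:  assumed threshold for "unusual"
    """
    flagged = []
    for res_id, res_type, volume in volumes:
        mean, sd = reference[res_type]
        z = (volume - mean) / sd
        if abs(z) > z_cutoff:
            flagged.append((res_id, round(z, 2)))
    return flagged
```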

Abstract

NMRQ is a generalized web server designed to assess and visualize the quality of NMR-derived protein structures and NMR-derived chemical shift assignments. It performs five key functions including (i) assessment of the quality/correctness and disposition of chemical shift assignments, (ii) assessment of the structure ensemble quality (via superposition and RMSD analysis), (iii) assessment of the correctness of NOE restraints, (iv) assessment of overall structure quality, and (v) the identification of key structural features (via multiple geometric and heuristic measures). It accepts chemical shifts, PDB coordinates and NOE restraint tables as input. From these NMRQ performs extensive, unbiased and fully automated analyses and generates high-quality tables, graphs and color images of both collective and individual structure/chemical shift properties and quality scores. Additionally we introduce the concept of the chemical shift R factor and demonstrate how chemical shifts can serve as independent, orthogonal measures of structure quality relative to standard NOE assessments. The NMRQ server is freely available at http://wishart.biology.ualberta.ca/NMRQ

Summary Reports

NMRQ identifies anomalies, outliers, and other peculiar properties at the residue, model, and ensemble levels. These possible problems are presented as Summary Reports.

(a) Residue Summaries: Possible problems are tabulated for each residue. Residue reports also include per-residue plots (per-residue Ramachandran plots, chi-distribution, chi1-chi2 normality plots, etc.) and tabular summaries of the various analyses.

(b) Model Summaries: The problematic residues for each model are shaded red. The darkness of the shading corresponds to the number of identified problems per residue. The identified anomalies for each problematic residue are tabulated below the model.

(c) Ensemble Summaries: Residues with persistent problems, i.e. that appear in a majority of the models, are mapped to a representative model for the ensemble.
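The majority rule for persistent problems can be sketched in a few lines. The data layout (one set of flagged residue numbers per model) and the function name are assumptions for illustration. "Majority" is read here as strictly more than half of the models.

```python
from collections import Counter


def persistent_problems(problems_by_model):
    """Residues flagged in more than half of the models.

    problems_by_model: list with one collection of flagged residue
                       numbers per model in the ensemble.
    """
    n_models = len(problems_by_model)
    counts = Counter()
    for flagged in problems_by_model:
        # Convert to a set so a residue counts at most once per model.
        counts.update(set(flagged))
    return sorted(res for res, n in counts.items() if n > n_models / 2)
```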

Model and Ensemble quality are scored by comparing to a database of high-quality X-ray structures (Richardson Top 500 [1]) and NMR structures (RECOORD [2]). The scores are mapped to equivalent resolution and plotted as scatterplots (in development).

Superposition Report

NMRQ features a superpositioning algorithm that can automatically identify the well-ordered regions of the ensemble and restrict the superposition to these regions. A table (not shown) reports the C-alpha, backbone, heavy atom, and all-atom RMSDs (to average) for the well-ordered regions and for all residues. (a) The superpositioned ensemble is rendered in stereo-view and coloured by model. A hyperlinked legend facilitates drill-down to the individual models. (b) The ensemble is presented with shading to indicate the degree of local RMSD. (c) A per-residue RMSD plot shows local RMSD variability. The secondary structure overlay can be used to examine the correspondence of secondary structure elements and RMSD.
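A per-residue RMSD to the average structure can be sketched as below. The sketch assumes the models are already superposed and that each residue is represented by a single atom (for example C-alpha). The superposition step is omitted. The poster does not describe how NMRQ identifies well-ordered regions, so the fixed cutoff in `well_ordered` is a hypothetical stand-in.

```python
import math


def per_residue_rmsd(ensemble):
    """Per-residue RMSD to the mean position across already-superposed models.

    ensemble: list of models; each model is a list of (x, y, z) tuples,
              one atom per residue, in the same residue order.
    """
    n_models = len(ensemble)
    n_res = len(ensemble[0])
    rmsds = []
    for r in range(n_res):
        coords = [model[r] for model in ensemble]
        mean = tuple(sum(c[k] for c in coords) / n_models for k in range(3))
        sq = sum(sum((c[k] - mean[k]) ** 2 for k in range(3)) for c in coords)
        rmsds.append(math.sqrt(sq / n_models))
    return rmsds


def well_ordered(rmsds, cutoff):
    """Indices of residues whose local RMSD is at or below an assumed cutoff."""
    return [i for i, v in enumerate(rmsds) if v <= cutoff]
```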


Statistics Report

Torsion angle, volume, and accessible surface area values are combined to create (a) Stereo/Packing quality plots, and (b) 3D Profile Quality plots. Many other statistics (not shown) are also reported here.


Torsion Angle Report

NMRQ reports on the quality of φ, ψ, χ and ω torsion angles. (a) The phi+psi variability plot highlights the ordered and disordered regions of the ensemble. (b) The Ramachandran plot quality can be assessed using the φ, ψ distributions derived from the Richardson Top 500 [1], RECOORD [2], or Morris et al. [3]. Tables display the residues in disallowed regions of the plot.

Not shown but included are χ1 dispersion plots, χ1-χ2 normality plots, χ1 angle favorability plots, and χ1-Accessible Surface Area covariance plots.
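One way to quantify torsion angle variability across an ensemble is the circular variance, which handles the wrap-around at ±180° correctly. The poster does not say how NMRQ defines its phi+psi variability, so the sketch below substitutes circular variance as a commonly used measure. The function names and the choice to sum the φ and ψ values are assumptions.

```python
import math


def circular_variance(angles_deg):
    """Circular variance of angles in degrees: 0 = identical, 1 = fully dispersed."""
    n = len(angles_deg)
    if n == 0:
        raise ValueError("need at least one angle")
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    # The length of the mean unit vector is 1 when all angles agree.
    return 1.0 - math.hypot(c, s)


def phi_psi_variability(phis, psis):
    """Sum of the phi and psi circular variances for one residue across models."""
    return circular_variance(phis) + circular_variance(psis)
```

Angles of 179° and -179° give a value near zero with this measure, whereas a naive standard deviation would report them as widely spread.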
