![Page 1: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/1.jpg)
High Throughput Computing and Protein Structure
Stephen E. Hamby
![Page 2: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/2.jpg)
Overview
• Introduction To Protein Structure
• Dihedral Angles
• Previous Work
• Support Vector Regression
• Optimisation
• Prediction
• Results
• Conclusions
![Page 3: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/3.jpg)
Introduction To Protein Structure
Molecules with massive biological importance
Structure determination gives insight into:
• Function, dynamics, potential drug targets.
Experimental structure determination is:
• Expensive, slow, difficult.
![Page 4: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/4.jpg)
Introduction To Protein Structure
Primary Structure:
Order of Amino Acids
Secondary Structure:
Building blocks
Tertiary Structure:
Complete 3D Structure
![Page 5: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/5.jpg)
Introduction To Protein Structure
Secondary Structure Types
α-helix
β-sheet
Random Coil
![Page 6: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/6.jpg)
Dihedral Angles
![Page 7: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/7.jpg)
Dihedral Angles
![Page 8: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/8.jpg)
Dihedral Angles
Finding the secondary structure of a protein is a step towards finding its complete structure
Predicting dihedral angles can help us to get the secondary structure
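For reference, a dihedral angle is the signed angle between the two planes defined by four consecutive atoms. In a protein backbone, φ uses the atoms C(i-1), N, Cα, C and ψ uses N, Cα, C, N(i+1). The sketch below computes such an angle from four 3D coordinates; the function name and the example coordinates are illustrative and do not come from the talk.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b1 = _sub(p1, p0)
    b2 = _sub(p2, p1)
    b3 = _sub(p3, p2)
    n1 = _cross(b1, b2)          # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)          # normal of the plane through p1, p2, p3
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates: the outer atoms are at right angles about the z axis.
angle = dihedral_degrees((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 1))
```

Because the result is an angle, values near -180° and +180° describe almost the same conformation, which is worth keeping in mind when comparing predicted and observed angles.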
How Can We Predict Dihedral Angles?
![Page 9: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/9.jpg)
Previous work
Destruct
Multiple neural networks.
Iterative method.
Predicts secondary structure
and dihedral angles.
![Page 10: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/10.jpg)
Previous work
Real Spine
Twin neural networks give a consensus prediction.
Predicts dihedral angles from various amino acid properties, amino acid composition, and predicted structure.
![Page 11: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/11.jpg)
Support Vector Regression
Kernel machine learning maps the data into a higher-dimensional feature space, where a linear relationship can be found.
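A small illustration of the idea, not taken from the talk: for 2D inputs, the degree-2 polynomial kernel (x·z)² equals an ordinary dot product in a 3D feature space. The explicit map `feature_map` below is written out only to show the equivalence; in practice the kernel is evaluated directly and the higher-dimensional vectors are never built.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def feature_map(x):
    """Explicit degree-2 map for a 2D point: (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly2_kernel(x, z):
    """Same inner product, computed without leaving the input space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(feature_map(x), feature_map(z))
implicit = poly2_kernel(x, z)
print(explicit, implicit)
```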
![Page 12: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/12.jpg)
Support Vector Regression
Attempts to fit a linear function to the data in a high dimensional feature space
Accurate but…
Slow, needs optimisation, black box.
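As a sketch of what the fit is scored on, assuming the standard ε-insensitive formulation of support vector regression (the slides do not name the variant used): errors smaller than a tolerance ε cost nothing, and larger errors are penalised linearly. The function names and example values here are illustrative.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

def mean_loss(targets, predictions, epsilon):
    losses = [epsilon_insensitive_loss(t, p, epsilon)
              for t, p in zip(targets, predictions)]
    return sum(losses) / len(losses)

# Hypothetical targets and predictions, with a tolerance of 0.1.
print(mean_loss([1.0, 2.0, 3.0], [1.05, 2.5, 3.0], epsilon=0.1))
```

The tolerance ε is one of the parameters that has to be tuned, which is part of why the method needs optimisation.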
![Page 13: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/13.jpg)
Support Vector Regression
Kernel Choice
We tested the various kernels available through the PyML package.
These are the linear, polynomial, and Gaussian kernels.
We tested them using the CASP4 dataset.
Gaussian kernel produced the best results.
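The three kernel families can be written down in a few lines. This is a plain-Python sketch of the textbook definitions, not the PyML API; the parameter names `degree`, `coef0`, and `gamma` and their default values are assumptions chosen for illustration.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def linear_kernel(x, z):
    return dot(x, z)

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    return (dot(x, z) + coef0) ** degree

def gaussian_kernel(x, z, gamma=0.5):
    """exp(-gamma * ||x - z||^2); equals 1 when x == z and decays with distance."""
    squared_distance = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * squared_distance)

x, z = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(x, z), polynomial_kernel(x, z), gaussian_kernel(x, z))
```

The Gaussian kernel adds a width parameter (`gamma` here) that must be tuned alongside the regression parameters.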
![Page 14: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/14.jpg)
Optimisation
Three interdependent parameters
Grid-based optimisation on the CASP4 dataset.
Around 10,000 three-hour jobs.
Run in blocks of 10 on Jupiter
Accuracy assessed using the Pearson correlation coefficient
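A minimal sketch of how such a search could be organised. The slides do not name the three parameters, so `c_values`, `gamma_values`, and `epsilon_values` are assumptions (typical for Gaussian-kernel support vector regression), and the grid values are made up. Each grid point would correspond to one training job, scored by the Pearson correlation between predicted and observed angles.

```python
import itertools
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def parameter_grid(c_values, gamma_values, epsilon_values):
    """Every combination of the three parameters: one job per combination."""
    return list(itertools.product(c_values, gamma_values, epsilon_values))

def blocks(items, size):
    """Split the job list into consecutive blocks for submission."""
    return [items[i:i + size] for i in range(0, len(items), size)]

# Hypothetical grid values, for illustration only.
grid = parameter_grid([0.1, 1.0, 10.0], [0.01, 0.1], [0.05, 0.1])
job_blocks = blocks(grid, 10)
print(len(grid), [len(block) for block in job_blocks])
```

Because the parameters are interdependent, the full grid is searched rather than tuning each parameter separately, which is what drives the job count up.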
![Page 15: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/15.jpg)
Prediction
Support vector machine using a Gaussian kernel and optimal parameters.
Training on the CB513 dataset.
Tested by 10-fold cross-validation.
CASP4 used as a test set.
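The cross-validation step can be sketched as follows. This assumes the split is made over whole proteins (CB513 contains 513 sequences) and in order, without shuffling; the talk does not say how the folds were drawn, and `k_fold_indices` is a hypothetical helper.

```python
def k_fold_indices(n_samples, k=10):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds

folds = k_fold_indices(513, k=10)
print([len(test) for _, test in folds])
```

Each protein appears in exactly one test fold, so every prediction used in the cross-validated score comes from a model that did not see that protein during training.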
![Page 16: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/16.jpg)
Results
| | Destruct | Real Spine | SVM Prediction |
|---|---|---|---|
| Pearson Correlation Coefficient | 0.42 | 0.62 | 0.57 |
CASP4 Test set gives Pearson Correlation Coefficient of 0.56
Results measured by cross validation
![Page 17: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/17.jpg)
Results
Using Secondary structure predictions made by cascade correlation neural networks:
Dihedral angle prediction assisted by predicted structure gives a Pearson correlation coefficient of 0.582.
Subsequent iterations should lead to better predictions of both structure and dihedral angles.
![Page 18: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/18.jpg)
What Next?
Using further iterations to improve accuracy.
Current method is a black box.
Can we use a program like Trepan to extract some definite rules about secondary structure?
![Page 19: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/19.jpg)
Conclusions
• Dihedral Angles define protein secondary structure
• Using Support Vector Machines it is possible to predict dihedral angles
• We (hopefully!) can use predicted dihedral angles to improve the accuracy of secondary structure prediction.
![Page 20: High Throughput Computing and Protein Structure](https://reader035.vdocuments.site/reader035/viewer/2022070500/56816837550346895dddfa93/html5/thumbnails/20.jpg)
Acknowledgements
Jonathan Hirst
Hirst group members
BBSRC
The University of Nottingham