ProteinShop: A Tool for Protein Structure Prediction and Modeling
Silvia Crivelli
Computational Research Division Lawrence Berkeley National Laboratory
The Protein Structure Prediction Problem
To determine how proteins, the building
blocks of living cells, fold themselves into
three-dimensional shapes that define the
role they play in life.
Importance of Protein Structure Prediction
• The shape of a protein determines its function.
• Knowledge of structure is used in many ways:
– Drug design
– Design of synthetic proteins
– Re-engineering defective proteins
• Genome projects are providing sequences for many proteins whose structure will need to be determined.
Protein Structures
Pro Gly Leu Ser
Proteins consist of a long chain of amino acids, the primary structure
[Figure: polypeptide chain of repeating N, H, O, and R groups, labeled: amino acid, backbone, side chain, H-bond]
Protein Structures
Pro Gly Leu Ser
Proteins consist of a long chain of amino acids, the primary structure
The constituent amino acids may encourage hydrogen bonding that forms regular structures, called secondary structures
The secondary structures fold together to form a compact 3-dimensional shape, called the tertiary structure
α-helix β-sheet
Ab Initio Approach
The problem can be formulated as a global minimization problem: the tertiary structure is assumed to occur at the global minimum of the free energy function of the primary sequence.
Our Goal: To provide an approach that relies more on physical principles than on information from known proteins
Ab Initio Method
Tertiary structure is believed to minimize potential energy:
$\min_x V_{MM}(x)$, where $x$ = atom coordinates
Difficulties:
• Proposed energy function may not match nature
• $O(e^{n^2})$ local minima
• Very large parameter space
e.g., a modestly sized protein: 100 amino acids, ~1,600 atoms, ~4,800 variables
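The variable count in this example follows from three Cartesian coordinates per atom. A minimal sketch of that arithmetic is below; the figure of 16 atoms per residue is an average inferred from the slide's own numbers (about 1,600 atoms for 100 amino acids), not an exact constant, and the function name is hypothetical.

```python
# Rough size of the search space for the slide's example protein.
# ATOMS_PER_RESIDUE is an average inferred from the slide (1,600 / 100),
# not a fixed property of proteins.
ATOMS_PER_RESIDUE = 16
COORDS_PER_ATOM = 3  # x, y, z


def num_variables(num_residues: int) -> int:
    """Number of Cartesian coordinates the minimization has to search over."""
    return num_residues * ATOMS_PER_RESIDUE * COORDS_PER_ATOM


print(num_variables(100))
```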
The Search Algorithm
Given the amino acid sequence of a protein, find the global minimum of the free energy function.
Phase 1: Generate Starting Configurations → Phase 2: Global Optimization
Secondary Structure Predictions in Phase 1
Sequence: SKIGIDGFGRIGRLVLRAALSCGAQ

Sequence: SKIGIDGFGRIGRLVLRAALSCGAQ
Type:     CBBBBBCCCAAAAAAACCCBBBBBC
Weight:   1135522356789992888566733
Servers predict secondary structure likely to be in a target protein based on a large database of known proteins.
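One way to turn a per-residue prediction like this into candidate secondary-structure segments is to run-length encode the Type row. The sketch below is illustrative only: it treats the letters as opaque labels (reading A as helix, B as strand, and C as coil is an assumption, since the slide does not define them), drops the stray space in the Type row so the three rows align at 25 characters, and uses a hypothetical helper name `segments`.

```python
from itertools import groupby

# Rows as shown on the slide (stray space in the Type row dropped so that
# all three rows have 25 characters).
sequence = "SKIGIDGFGRIGRLVLRAALSCGAQ"
types = "CBBBBBCCCAAAAAAACCCBBBBBC"
weights = "1135522356789992888566733"


def segments(type_row):
    """Run-length encode the type row into (label, start, end), end exclusive."""
    out, pos = [], 0
    for label, run in groupby(type_row):
        length = len(list(run))
        out.append((label, pos, pos + length))
        pos += length
    return out


# Report each segment with its residues and the mean of its per-residue weights.
for label, start, end in segments(types):
    mean_w = sum(int(w) for w in weights[start:end]) / (end - start)
    print(label, sequence[start:end], f"mean weight {mean_w:.1f}")
```

Segments labeled B would then be the strands whose pairing has to be decided when building starting configurations.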
Matching the predicted strands is a combinatorial problem
Which strands are paired?
Which orientation? parallel or anti-parallel
Which residues are paired? odd or even
There are $n!\,2^{n-2}$ possible n-stranded motifs
96 motifs for n=4
960 motifs for n=5
It takes weeks to create some of these configurations using constrained local minimizations!
Distribution of Beta Sheets in Proteins with Applications to Structure Prediction
Ruczinski, Kooperberg, Bonneau, and Baker, Proteins 48, 2002
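The motif count on this slide can be checked directly. The sketch below reads the flattened formula as n! times 2 to the power (n-2), which reproduces both quoted values; the function name is hypothetical.

```python
from math import factorial


def num_motifs(n: int) -> int:
    """Count of n-stranded motifs using the slide's formula n! * 2**(n - 2)."""
    if n < 2:
        raise ValueError("the formula is applied here only for n >= 2")
    return factorial(n) * 2 ** (n - 2)


for n in (4, 5):
    print(n, num_motifs(n))
```

The factorial growth is why building every candidate motif by constrained minimization quickly becomes impractical.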
CASP4 Competition
• Fourth community-wide experiment on the Critical Assessment of Techniques for Protein Structure Prediction (2000)
• Our group predicted 8 proteins
• Largest protein had 240 aa
• Most complex fold had 2 β-strands
ProteinShop
• Interactive tool for protein manipulation
• Designed to quickly create initial configurations
• It takes weeks to create a number of configurations using constrained minimizations
• It takes a few hours to create the same configurations with ProteinShop
Phase 1 with ProteinShop
[Flowchart] Overall: Amino Acid Sequence → Phase 1 → Initial Configurations → Phase 2 → Final Configuration
Phase 1 steps: Amino Acid Sequence → Secondary Structure Prediction → Structure Sequence → Geometry Generation → Pre-configuration → Direct Manipulation → Initial Configurations
ProteinShop takes minutes
CASP4 Competition (before ProteinShop)
• Our group predicted 8 proteins
• Largest protein had 240 aa
• Most complex fold had 2 β-strands
CASP5 Competition (with ProteinShop)
• Our group predicted 20 proteins
• Largest protein had 417 aa
• Most complex fold had 13 β-strands
Phase 2
[Flowchart] Overall: Amino Acid Sequence → Phase 1 → Initial Configurations → Phase 2: Global Optimization → Final Configuration
Phase 2 steps: Initial Configurations → Subspace Selection → Subspace Optimization → Candidate Selection → Final Configuration
Takes months to converge using hundreds of processors on Seaborg!
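The three Phase 2 stages named on this slide (subspace selection, subspace optimization, candidate selection) can be illustrated on a toy problem. The sketch below is not the group's algorithm: the energy function is a made-up multi-minimum stand-in, the subspace optimizer is a simple random hill climb rather than a real local minimizer, and every name and parameter is hypothetical.

```python
import random


def energy(x):
    """Toy multi-minimum stand-in for the real energy function (hypothetical)."""
    return sum((xi * xi - 1.0) ** 2 + 0.3 * xi for xi in x)


def optimize_subspace(x, idx, rng, steps=200, scale=0.5):
    """Hill-climb over the coordinates in idx only; all others stay fixed."""
    best, best_e = list(x), energy(x)
    for _ in range(steps):
        trial = list(best)
        for i in idx:
            trial[i] += rng.gauss(0.0, scale)
        e = energy(trial)
        if e < best_e:  # accept only improvements
            best, best_e = trial, e
    return best, best_e


def global_optimize(initial, rounds=30, subspace_size=2, keep=4, seed=0):
    """Repeat: pick a configuration, optimize a random subspace, keep the best."""
    rng = random.Random(seed)
    pool = sorted((energy(x), list(x)) for x in initial)
    for _ in range(rounds):
        _, x = rng.choice(pool)                         # pick a configuration
        idx = rng.sample(range(len(x)), subspace_size)  # subspace selection
        cand, e = optimize_subspace(x, idx, rng)        # subspace optimization
        pool = sorted(pool + [(e, cand)])[:keep]        # candidate selection
    return pool[0]


starts = [[1.0, 1.0, 1.0, 1.0], [0.5, -0.5, 0.5, -0.5]]
best_energy, best_x = global_optimize(starts)
print(round(best_energy, 3), [round(v, 2) for v in best_x])
```

Restricting each step to a small subset of variables keeps individual optimizations cheap, which matters when the full problem has thousands of variables.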
Phase 2 with ProteinShop
[Flowchart] Overall: Amino Acid Sequence → Phase 1 → Initial Configurations → Phase 2: Global Optimization → Final Configuration
Phase 2 steps: Initial Configurations → Subspace Selection → Subspace Optimization → Candidate Selection → Final Configuration
Added with ProteinShop: Monitoring System, Direct Manipulation, Steering System
Will reduce computation time
Monitoring System
• Monitor progress of overall optimization/each optimization process
Monitoring System
• Monitor progress of overall optimization/each optimization process
• Alert user to important events during optimization
  • A sudden drop in internal energy
  • A group of processes getting stuck
• Test new heuristics for expanding nodes of the tree
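The two alert conditions listed above (a sudden drop in internal energy, processes getting stuck) can be expressed as simple checks on each process's energy history. This is a minimal sketch under assumptions: the thresholds, window size, function names, and the sample traces are all hypothetical and are not taken from ProteinShop.

```python
def sudden_drop(history, threshold=5.0):
    """True if the latest step lowered the energy by more than threshold."""
    return len(history) >= 2 and history[-2] - history[-1] > threshold


def is_stuck(history, window=5, tol=1e-3):
    """True if the energy varied by less than tol over the last window steps."""
    if len(history) < window:
        return False
    recent = history[-window:]
    return max(recent) - min(recent) < tol


# Hypothetical energy traces, one list per optimization process.
traces = {
    "proc0": [-10.0, -10.5, -18.0],
    "proc1": [-7.0, -7.0, -7.0, -7.0, -7.0],
    "proc2": [-3.0, -3.4, -3.9],
}
alerts = {
    name: ("drop" if sudden_drop(h) else "stuck" if is_stuck(h) else "ok")
    for name, h in traces.items()
}
print(alerts)
```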
Steering System
• Change configurations during optimization to account for developments not anticipated during Phase 1
• Manipulate proteins that don’t seem to be realistic or that are stuck in a local minimum
• Allow pruning of the optimization tree
  • Assign multiple processes to a configuration that just had a drop in internal energy
  • Assign stuck processes to other configurations
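The reassignment idea in the last bullet can be sketched as a pure function over a process-to-configuration map. The data layout, the names, and the policy of sending every stuck process to a single promising configuration are assumptions made for illustration, not a description of how ProteinShop steers an optimization.

```python
def reassign(assignment, stuck, promising):
    """Return a new process -> configuration map in which stuck processes
    are moved to the promising configuration; others are left untouched."""
    return {
        proc: (promising if proc in stuck else conf)
        for proc, conf in assignment.items()
    }


# Hypothetical state: three processes, each working on one configuration.
assignment = {"proc0": "confA", "proc1": "confB", "proc2": "confC"}

# Suppose confA just had a drop in internal energy and proc1 is stuck.
new_assignment = reassign(assignment, stuck={"proc1"}, promising="confA")
print(new_assignment)
```

Returning a new map instead of mutating the old one makes it easy for a user to review a proposed change before it is applied.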
Plans for the Future
• Use of the monitoring and steering features to develop and test a new method for protein structure prediction
• Compete in CASP6 (Critical Assessment of Techniques for Protein Structure Prediction)
• Expand and enhance ProteinShop
O. Kreylos, N. Max, B. Hamann, S. Crivelli, and W. Bethel. Interactive Protein Manipulation. Winner of the Best Application Award, IEEE Visualization 2003, Seattle.
ProteinShop
Available to academic and non-profit organizations
proteinshop.lbl.gov