Cezary Czaplewski, Faculty of Chemistry, University of Gdańsk, Poland
DESCRIPTION
All-atom molecular simulations of protein folding and unfolded-state dynamics and structure with accelerated calculations on GPU. Cezary Czaplewski, Faculty of Chemistry, University of Gdańsk, Poland. The 10th Protein Folding Winter School, KIAS, February 7-11, 2011.
TRANSCRIPT
![Page 1: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/1.jpg)
All-atom molecular simulations of protein folding and unfolded-state dynamics and
structure with accelerated calculations on GPU
Cezary Czaplewski, Faculty of Chemistry, University of Gdańsk, Poland
The 10th Protein Folding Winter School, KIAS, February 7-11, 2011
![Page 2: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/2.jpg)
Molecular Simulation of ab Initio Protein Folding for
a Millisecond Folder NTL9(1-39)
Vincent A. Voelz (1), Gregory R. Bowman (2), Kyle Beauchamp (2), Vijay S. Pande (1,2,3)
(1) Department of Chemistry, Stanford University; (2) Biophysics Program, Stanford University;
(3) Department of Structural Biology, Stanford University
J. AM. CHEM. SOC. 2010, 132, 1526–1528
![Page 3: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/3.jpg)
• Computer simulations, validated by experiment, can help gain a complete understanding of how proteins fold.
• Folding rates span more than a million-fold range, which suggests possible diversity in folding mechanisms.
• Folding@Home on GPUs yielded several folding trajectories of the 39-residue NTL9(1-39), the slowest-folding protein (~1.5 ms folding time) folded ab initio with all-atom MD to date.
• Insights into folding mechanism based on Markov state model (MSM).
![Page 4: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/4.jpg)
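[Figure: logarithmic time axis running from 10^-15 s (femto) through 10^-12 s (pico), 10^-9 s (nano), 10^-6 s (micro) and 10^-3 s (milli) to 10^0 s (seconds). Labelled events: all-atom MD step, bond vibration, side-chain rotation, loop closure, helix formation, folding of β-hairpins, protein folding.]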
![Page 5: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/5.jpg)
GPU
• A processor on a graphics card, dedicated to floating-point calculations
• Incorporates stream processors that implement specialized mathematical operations in hardware
• Stream Processing: applications can use multiple computational units without explicitly managing allocation, synchronization, or communication among those units.
![Page 6: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/6.jpg)
CPU vs. GPU
CPU – 4 cores
![Page 7: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/7.jpg)
Floating-Point Operations per Second for the CPU and GPU
![Page 8: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/8.jpg)
![Page 9: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/9.jpg)
Proteins folded ab initio by all atom MD
• Trp-cage, 4.1 μs (Pitera, Swope, PNAS 2003)
• Fip35 WW, 13 μs (Ensign, Pande, Biophys. J., 2009)
• Villin headpiece, 10 μs (Zagrovic, Snow, Shirts, Pande, JMB 2002)
• Fast folding villin variant, <1 μs (Ensign, Kasson, Pande, JMB 2007)
![Page 10: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/10.jpg)
NTL9(1-39): ~1.5 ms experimental folding time
![Page 11: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/11.jpg)
• Folding@Home ran Gromacs with the OpenMM library, written specifically for GPUs, which allowed dramatically longer trajectories
• AMBER ff96 with Onufriev, Bashford, Case GBSA
• Up to 10000 parallel MD simulations at 300, 330, 370 and 450 K
• Starting from native, random coil and extended conformations
• Aggregate 1.52 ms
• Out of the ~3000 trajectories started from unfolded states at 370 K, only two reach <3.5 Å RMSD and eight <4 Å RMSD
• Number of folding events is consistent with a simple model of parallel uncoupled folding as a two-state Poisson process: $\langle n \rangle = \int M(t)\, k \exp(-M(t)\, k t)\, dt$
• M(t) is the number of parallel simulations that reach time t; k is the ~640/s experimental folding rate
![Page 12: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/12.jpg)
Distributions of RMSD for native-state simulations of NTL9(1−39) after 10 μs
The number of parallel simulations at 370 K that reach time t.
Posterior predictions of the folding rate
![Page 13: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/13.jpg)
A snapshot from a folding trajectory (3.1 Å RMSD)
Non-native and native-like hydrophobic core arrangements
![Page 14: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/14.jpg)
Markov state model (MSM)
• MSM constitutes a kinetic clustering
• Conformations that can interconvert rapidly are grouped into the same state
• Conformations that can only interconvert slowly are grouped into separate states
• Satisfies the Markov property: the identity of the next state depends only on the identity of the current state and not on any of the previous states
• Transition probability matrix T propagates state probabilities p
• An implied timescale τ_k for a given lag time τ can be calculated from each eigenvalue μ_k of matrix T
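The last two bullets can be sketched numerically. The example below is a minimal illustration, not the analysis from the paper: the 3-state count matrix is made up, and the relation used for the implied timescales, -τ / ln μ_k, is the standard MSM definition rather than something spelled out on the slide. The function names are hypothetical.

```python
import numpy as np

def transition_matrix(counts):
    """Row-normalize a transition count matrix into a probability matrix T."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)

def implied_timescales(T, lag_time):
    """Implied timescales -lag_time / ln(mu_k), slowest first.

    The largest eigenvalue (mu = 1, the stationary distribution) is skipped.
    Only real parts are used, which is exact for reversible models.
    """
    mu = np.sort(np.linalg.eigvals(T).real)[::-1]
    mu = mu[1:]                       # drop the stationary eigenvalue
    mu = mu[(mu > 0.0) & (mu < 1.0)]  # timescales are defined only for 0 < mu < 1
    return -lag_time / np.log(mu)

# Hypothetical transition counts between three states, observed at a lag of 1 ns.
counts = [[90, 10, 0],
          [10, 80, 10],
          [0, 10, 90]]
T = transition_matrix(counts)

# T propagates state probabilities by one lag time: p(t + lag) = p(t) T.
p0 = np.array([1.0, 0.0, 0.0])
p1 = p0 @ T

timescales = implied_timescales(T, lag_time=1.0)  # in ns
print(p1, timescales)
```

If the implied timescales stop changing as the lag time is increased, the model is usually taken to be Markovian at that lag time.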
![Page 15: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/15.jpg)
Details of the MSMBuilder analysis
• 100,000 microstates were generated by clustering conformations separated by 10 ns using the k-centers algorithm
• The remaining 90% of the data was then assigned to these clusters
• The resulting microstates had an average radius of ~4.5 Å
• A macrostate model was generated by lumping microstates into 2,000 macrostates using the Robust Perron Cluster Analysis (PCCA+) algorithm
• Although only a few folding trajectories were observed directly, a network of many possible pathways can be inferred from the overlapping sampling of local transitions.
• Top 10 folding fluxes, calculated by a greedy backtracking algorithm
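The clustering step in the list above (k-centers on a subsample, then assignment of the remaining data) can be sketched as follows. This is a toy version under stated assumptions: it uses Euclidean distance between 2-D points as a stand-in for RMSD between conformations, the data are synthetic, and `k_centers` and `assign` are hypothetical helper names, not MSMBuilder functions.

```python
import numpy as np

def k_centers(points, k, first=0):
    """Greedy k-centers: repeatedly promote the point farthest from all centers.

    Returns the indices of the chosen centers and the final cluster radius
    (the largest distance from any point to its nearest center).
    """
    points = np.asarray(points, dtype=float)
    centers = [first]
    dist = np.linalg.norm(points - points[first], axis=1)
    while len(centers) < k:
        nxt = int(np.argmax(dist))
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return centers, float(dist.max())

def assign(points, center_points):
    """Assign every point to its nearest center."""
    points = np.asarray(points, dtype=float)
    center_points = np.asarray(center_points, dtype=float)
    d = np.linalg.norm(points[:, None, :] - center_points[None, :, :], axis=2)
    return d.argmin(axis=1)

# Synthetic data: three well-separated groups of 20 points each.
rng = np.random.default_rng(1)
blob_means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
data = np.concatenate([m + 0.3 * rng.standard_normal((20, 2)) for m in blob_means])

subsample = data[::10]                      # cluster only 10% of the data
centers, radius = k_centers(subsample, k=3)
labels = assign(data, subsample[centers])   # then assign all of the data
print(centers, radius, labels)
```

Clustering a subsample keeps the cost of the greedy search manageable; the assignment pass is a single nearest-center lookup per conformation.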
![Page 16: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/16.jpg)
Implied timescales for Markov State Models (MSMs) built at lag times between 1 and 32 ns
Panels: 100,000-microstate model; 2000-macrostate model
![Page 17: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/17.jpg)
A scatter plot of the 2000 macrostates. Shown in red are the 14 macrostates transited by the top ten pathway fluxes.
![Page 18: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/18.jpg)
A 2000-state Markov State Model (MSM).
The top 10 folding pathways account for ∼25% of the total flux and transit 14 of the 2000 macrostates
![Page 19: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/19.jpg)
Contact profile subspaces used to calculate $Q_\alpha$, $Q_{12}$, $Q_{13}$
$$Q = \frac{c \cdot c_{\mathrm{nat}}}{c_{\mathrm{nat}} \cdot c_{\mathrm{nat}}}$$
c(x): contact profile indexed by x = (i, j)
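A contact-based Q of this kind can be sketched as the overlap of a conformation's contact profile c with the native profile c_nat, normalized by the overlap of c_nat with itself. The sketch below assumes a binary contact profile with a distance cutoff; the cutoff value, the toy coordinates and the function names are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def contact_profile(coords, pairs, cutoff=9.0):
    """Binary contact profile c(x): 1.0 if residues (i, j) of pair x are within cutoff."""
    coords = np.asarray(coords, dtype=float)
    i, j = np.asarray(pairs).T
    d = np.linalg.norm(coords[i] - coords[j], axis=1)
    return (d < cutoff).astype(float)

def q_value(c, c_nat):
    """Q = (c . c_nat) / (c_nat . c_nat): fraction of native contacts present in c."""
    return float(np.dot(c, c_nat) / np.dot(c_nat, c_nat))

# Toy example: four "residues" on a line (coordinates in Å) and three residue pairs.
pairs = [(0, 2), (0, 3), (1, 3)]
native = [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [8.0, 0.0, 0.0], [12.0, 0.0, 0.0]]
partly_unfolded = [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [8.0, 0.0, 0.0], [20.0, 0.0, 0.0]]

c_nat = contact_profile(native, pairs)
c = contact_profile(partly_unfolded, pairs)
q = q_value(c, c_nat)
print(c_nat, c, q)
```

Restricting `pairs` to a subset of residue pairs (for example, only pairs within one structural element) gives a Q for that subspace.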
![Page 20: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/20.jpg)
The 14 macrostates plotted along structural and kinetic reaction coordinates
![Page 21: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/21.jpg)
Contact profiles for the 14 macrostates involved in the top folding pathways
![Page 22: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/22.jpg)
Values of Q for each of the 14 macrostates involved in the top ten folding pathways
![Page 23: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/23.jpg)
Q-values plotted versus pfold (committor) values
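For reference, a pfold (committor) value can be obtained from an MSM transition matrix by solving a small linear system: pfold is 0 in the unfolded states, 1 in the folded states, and for every other state equals the transition-probability-weighted average of the pfold values of its neighbours. The sketch below uses a made-up 4-state chain; the matrix and the function name are illustrative assumptions only.

```python
import numpy as np

def committor(T, unfolded, folded):
    """pfold for each state: probability of reaching `folded` before `unfolded`.

    Solves q_i = sum_j T_ij q_j for intermediate states, with q = 0 on the
    unfolded states and q = 1 on the folded states.
    """
    T = np.asarray(T, dtype=float)
    n = T.shape[0]
    folded = list(folded)
    boundary = set(unfolded) | set(folded)
    inter = [i for i in range(n) if i not in boundary]
    q = np.zeros(n)
    q[folded] = 1.0
    if inter:
        A = np.eye(len(inter)) - T[np.ix_(inter, inter)]
        b = T[np.ix_(inter, folded)].sum(axis=1)
        q[inter] = np.linalg.solve(A, b)
    return q

# Hypothetical 4-state chain: state 0 is unfolded, state 3 is folded.
T = [[0.9, 0.1, 0.0, 0.0],
     [0.1, 0.8, 0.1, 0.0],
     [0.0, 0.1, 0.8, 0.1],
     [0.0, 0.0, 0.1, 0.9]]
pfold = committor(T, unfolded=[0], folded=[3])
print(pfold)
```

States with pfold near 0.5 are equally likely to fold or unfold next, so they are natural candidates for a transition-state ensemble.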
![Page 24: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/24.jpg)
Macrostates l, m and n have very similar structural ensembles and similar pfold values.
These states differ mostly in their hairpin registrations and packing of the hairpin loop.
![Page 25: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/25.jpg)
Conclusions
• Existing force field models using implicit solvent are accurate enough to fold proteins ab initio at long time scales, opening the door to simulating more structurally complex proteins.
• There need not be a single pathway or single, dominant mechanism for the folding of a given protein.
• Multiple mechanisms could be simultaneously present.
• The sequence of the protein, coupled with the chemical environment, controls the extent to which each mechanistic pathway is observed.
![Page 26: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/26.jpg)
![Page 27: Cezary Czaplewski Faculty of Chemistry University of Gdańsk Poland](https://reader036.vdocuments.site/reader036/viewer/2022062410/568164da550346895dd72918/html5/thumbnails/27.jpg)
Take-home message
• A GPU can speed up your simulations 10 times
• Existing force field models using implicit solvent are accurate enough to fold proteins during MD.
• With only a few folding trajectories observed directly, a network of many possible pathways can be inferred from kinetic clustering using the Markov State Model.
• There can be several pathways for the folding of a given protein.
• Multiple folding mechanisms (diffusion-collision or nucleation-condensation) could be simultaneously present.