Computational Representation of Biological Molecules
Michel F. Sanner
The Molecular Graphics Laboratory
The Scripps Research Institute, La Jolla, California
CRBM, Sept. 9-10, 2003, UCSD, San Diego, CA
TSRI
Protein-Ligand
Protein-Protein
Complex Assemblies
Molecular Surfaces
AutoDock
SurfDock
MSMS
HARMONY
Tangible Models
Augmented Reality
The challenge
• Visualization
• Docking Methods
• Folding
• Protein Engineering
• Sequence Analysis
• Modeling
• MM - MD
• Ab Initio Methods
• Electrostatics Calculations
• Molecular Surfaces
• Etc ...
Python to the rescue
High level language as a scripting environment
[Diagram components: Molecules, Molecular Surfaces, DataBase, Electrostatics, Delaunay, Homology, CSG, 3D Viewer, MM-MD, New Method, Your Method]
Software components
• MolKit:
– read/write/represent/manipulate and query molecules
• DejaVu:
– General purpose 3D geometry viewer
• ViewerFramework:
– Visualization application template
• Mslib, PyBabel, PyMead, SFF, ...
(Sophie I. Coon, Michel F. Sanner and Art J. Olson, Re-usable components for structural bioinformatics, 9th Python Conference, 2001)
from MolKit.pdbParser import PdbParser
parser = PdbParser('1crn.pdb')
mols = parser.parse()
[Diagram (MolKit): PDB, Mol2, PQR, ... files are read by a Parser (PDB parser, MOL2 parser) that produces a MoleculeSet. Hierarchies shown: Molecule > Chain > Residue > Atom; Molecule > Residue > Atom; Molecule > Atom]
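To make the parser-to-hierarchy idea concrete, here is a minimal sketch that reads fixed-width ATOM records and groups them as chain > residue > atom. It is not MolKit's PdbParser: the function name `parse_atoms`, the nested-dictionary layout, and the sample records are assumptions made for illustration only.

```python
# Illustrative sketch only: this is NOT MolKit's PdbParser.
# A few records in PDB ATOM format (fixed-width columns), used as sample input.
SAMPLE_PDB = """\
ATOM      1  N   THR A   1      17.047  14.099   3.625  1.00 13.79
ATOM      2  CA  THR A   1      16.967  12.784   4.338  1.00 10.80
ATOM      8  N   THR A   2      15.115  11.555   5.265  1.00  7.81
"""


def parse_atoms(text):
    """Return {chain_id: {(res_name, res_seq): [atom_dict, ...]}}."""
    molecule = {}
    for line in text.splitlines():
        if not line.startswith(("ATOM  ", "HETATM")):
            continue  # skip every other record type
        atom = {
            "name": line[12:16].strip(),             # columns 13-16
            "coords": (float(line[30:38]),           # columns 31-38 (x)
                       float(line[38:46]),           # columns 39-46 (y)
                       float(line[46:54])),          # columns 47-54 (z)
            "temperatureFactor": float(line[60:66]),  # columns 61-66
        }
        chain = molecule.setdefault(line[21], {})     # column 22: chain ID
        residue_key = (line[17:20].strip(), int(line[22:26]))
        chain.setdefault(residue_key, []).append(atom)
    return molecule


molecule = parse_atoms(SAMPLE_PDB)
for chain_id, residues in molecule.items():
    for (res_name, res_seq), atoms in residues.items():
        print(chain_id, res_name, res_seq, [a["name"] for a in atoms])
```

Plain dictionaries keep the sketch short; an object hierarchy such as the one on the slide adds parent links and set types on top of the same grouping.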
MolKit (with Numeric)
TreeNode
– .parent, .children, .top, .name, .elementType, ...
– adopt(child)
TreeNodeSet(ListSet)
– [TreeNode1, TreeNode2, … ]
– __getattr__(self, name): for example, .name returns [TreeNode1.name, TreeNode2.name, …]
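The attribute forwarding described above can be sketched in a few lines. This is a simplified stand-in, not MolKit's implementation: the class bodies are assumptions, as is the choice to flatten set-valued attributes so that `children.children` yields one set.

```python
class TreeNodeSet(list):
    """Hypothetical, simplified set of nodes (not MolKit's TreeNodeSet)."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup on the list fails.
        if name.startswith("__"):
            raise AttributeError(name)  # leave special-method lookups alone
        values = [getattr(node, name) for node in self]
        if values and all(isinstance(v, TreeNodeSet) for v in values):
            # Set-valued attribute (e.g. .children): flatten into one set.
            flat = TreeNodeSet()
            for v in values:
                flat.extend(v)
            return flat
        return values


class TreeNode:
    """Hypothetical, simplified node with parent/children links."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = TreeNodeSet()
        if parent is not None:
            parent.adopt(self)

    @property
    def top(self):
        # Walk up the parent links to the root of the tree.
        node = self
        while node.parent is not None:
            node = node.parent
        return node

    def adopt(self, child):
        child.parent = self
        self.children.append(child)


mol = TreeNode("1crn")
chain = TreeNode("A", parent=mol)
r1 = TreeNode("THR1", parent=chain)
r2 = TreeNode("THR2", parent=chain)

print(mol.children.name)            # names of the chains
print(mol.children.children.name)   # names of the residues
```

Because `__getattr__` runs only after normal lookup fails, list methods such as `append` keep working unchanged on the set.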
TreeNode, TreeNodeSet
• TreeNodeSet:
– Boolean operation
– uniq( )
– split( )
– sort( )
– NodesFromName( )
– findChildrenOfType( )
– findParentOfType( )
– …
• TreeNode:
– adopt( ) / remove( )
– full_name( )
– NodeFromName( )
– split( ) / merge( )
– getParentOfType( )
– findType( )
– compare( )
– assignUniqIndex( )
– isAbove( ) / isBelow( )
– …
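As an illustration of Boolean operations and uniq( ) on a set of nodes, the sketch below uses an order-preserving list subclass. The class name `NodeSet`, the operator choices (`|`, `&`, `-`), and identity-based membership are assumptions, not MolKit's documented behavior.

```python
class NodeSet(list):
    """Hypothetical ordered set of nodes; membership is by object identity."""

    def uniq(self):
        # Drop repeated nodes while keeping first-seen order.
        seen, result = set(), NodeSet()
        for node in self:
            if id(node) not in seen:
                seen.add(id(node))
                result.append(node)
        return result

    def __or__(self, other):   # union
        return NodeSet(list(self) + list(other)).uniq()

    def __and__(self, other):  # intersection
        other_ids = {id(node) for node in other}
        return NodeSet(n for n in self.uniq() if id(n) in other_ids)

    def __sub__(self, other):  # difference
        other_ids = {id(node) for node in other}
        return NodeSet(n for n in self.uniq() if id(n) not in other_ids)


class Atom:
    def __init__(self, name):
        self.name = name


n, ca, c, o = (Atom(x) for x in ("N", "CA", "C", "O"))
backbone = NodeSet([n, ca, c, o])
picked = NodeSet([ca, o, ca])          # contains a repeated node

print([a.name for a in picked.uniq()])
print([a.name for a in backbone & picked])
print([a.name for a in backbone - picked])
print([a.name for a in NodeSet([n]) | picked])
```

Keeping list order rather than using a built-in `set` means selections still print in the order the atoms appear in the file.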
TreeNode and TreeNodeSet specialization
TreeNode
– Molecule, Atom
– Residue, Chain, Protein, ...
– SecondaryStructure, ... (Helix, Strand, Turn, Coil)
TreeNodeSet
– MoleculeSet, AtomSet
– ResidueSet, ChainSet, ProteinSet, ...
– SecondaryStructureSet, ... (HelixSet, StrandSet, TurnSet, CoilSet)
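One way to pair each node class with its set class is sketched below. The `set_class` attribute and the `as_set` and `elements` helpers are hypothetical names chosen for this example; they are not taken from MolKit.

```python
class TreeNodeSet(list):
    """Generic set; specialized sets add behavior for one node type."""


class TreeNode:
    # Hypothetical hook: which set type holds nodes of this class.
    set_class = TreeNodeSet

    def __init__(self, name):
        self.name = name

    @classmethod
    def as_set(cls, nodes):
        # Wrap nodes in the set type paired with this node class.
        return cls.set_class(nodes)


class AtomSet(TreeNodeSet):
    def elements(self):
        # Behavior that only makes sense for atoms.
        return sorted({atom.element for atom in self})


class Atom(TreeNode):
    set_class = AtomSet

    def __init__(self, name, element):
        super().__init__(name)
        self.element = element


class ResidueSet(TreeNodeSet):
    pass


class Residue(TreeNode):
    set_class = ResidueSet


atoms = Atom.as_set([Atom("CA", "C"), Atom("N", "N"), Atom("CB", "C")])
residues = Residue.as_set([Residue("THR1")])
print(type(atoms).__name__, atoms.elements())
print(type(residues).__name__)
```

With this pairing, type-specific operations live on the set class while the generic tree logic stays in the two base classes.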
Examples
>>> from MolKit import Read
>>> molecules = Read('./1crn.pdb')  # Read returns a ProteinSet
>>> mol = molecules[0]
>>> print mol.chains.residues.name
>>> print mol.chains.residues.atoms[20:85].full_name()
>>> from MolKit.molecule import Atom
>>> allAtoms = mol.findType(Atom)
>>> set1 = allAtoms.get(lambda x: x.temperatureFactor > 20)
>>> allResidues = allAtoms.parent.uniq()
>>> import Numeric
>>> for r in allResidues:
...     coords = r.atoms.coords
...     r.geomCenter = Numeric.sum(coords) / len(coords)
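The last two steps of the session, selecting atoms with a lambda and computing a geometric center per residue, can be tried without MolKit or Numeric. The sketch below uses made-up atoms and plain Python; `select` and `geom_center` are hypothetical helpers, and the averaging assumes the session's `Numeric.sum(coords)` reduces over the atoms (the first axis).

```python
from types import SimpleNamespace

# Made-up atoms standing in for a parsed molecule.
atoms = [
    SimpleNamespace(name="N", residue="THR1", coords=(0.0, 0.0, 0.0), temperatureFactor=25.0),
    SimpleNamespace(name="CA", residue="THR1", coords=(2.0, 0.0, 0.0), temperatureFactor=15.0),
    SimpleNamespace(name="N", residue="THR2", coords=(1.0, 1.0, 1.0), temperatureFactor=30.0),
    SimpleNamespace(name="CA", residue="THR2", coords=(3.0, 3.0, 1.0), temperatureFactor=10.0),
]


def select(nodes, predicate):
    """Keep the nodes for which the predicate is true."""
    return [node for node in nodes if predicate(node)]


def geom_center(coords):
    """Average the x, y and z columns of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(axis) / n for axis in zip(*coords))


# Selection using ordinary Python syntax, as in the session above.
hot = select(atoms, lambda a: a.temperatureFactor > 20)

# Group coordinates by residue, then compute one center per residue.
by_residue = {}
for atom in atoms:
    by_residue.setdefault(atom.residue, []).append(atom.coords)
centers = {res: geom_center(coords) for res, coords in by_residue.items()}

print([a.name for a in hot])
print(centers)
```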
MolKit Features
• Pdb, PQR, mol2 parsers
• Support for Sets
• Selection mechanism using Python syntax
• Secondary Structure
• Amber parameters assignment
• Atomic radii assignment (regexp)
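Regexp-based radii assignment can be pictured as an ordered list of (pattern, radius) rules matched against atom names. The sketch below is an assumption about the general approach, not MolKit's rule table: the patterns are illustrative, and the radii are the commonly quoted Bondi van der Waals values in angstroms.

```python
import re

# Hypothetical rule table: the first matching pattern wins.
# Radii are Bondi van der Waals values in angstroms.
RADIUS_RULES = [
    (re.compile(r"^H"), 1.20),
    (re.compile(r"^C"), 1.70),
    (re.compile(r"^N"), 1.55),
    (re.compile(r"^O"), 1.52),
    (re.compile(r"^S"), 1.80),
]


def assign_radius(atom_name):
    """Return the radius of the first rule matching atom_name, else None."""
    for pattern, radius in RADIUS_RULES:
        if pattern.match(atom_name):
            return radius
    return None


radii = {name: assign_radius(name) for name in ("N", "CA", "OXT", "SG", "HB2", "ZN")}
print(radii)
```

Matching on the atom name alone is ambiguous (for example, "CA" can be an alpha carbon or a calcium ion), so a practical rule table would also need the element or residue context.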
MolKit Critique: BAD
• No well-defined API
• Hierarchy of proteins
• Danger of stamping over attributes
• Slow for large structures
• Hierarchical structure not used much
• Requires a Python interpreter
[Scale: Severe to Mild]
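The "stamping over attributes" hazard is easy to reproduce with any Python object that allows attributes to be created on the fly. The sketch below uses a made-up `Node` class, and `set_new_attribute` is a hypothetical guard, not something MolKit provides.

```python
class Node:
    def __init__(self, name):
        self.name = name

    def full_name(self):
        return "node:" + self.name


def set_new_attribute(node, attr, value):
    """Hypothetical guard: refuse to overwrite anything the node already has."""
    if hasattr(node, attr):
        raise AttributeError("%r already exists on %s" % (attr, type(node).__name__))
    setattr(node, attr, value)


# Unguarded: creating attributes on the fly is convenient ...
atom = Node("CA")
atom.charge = -0.1
# ... but the same mechanism silently shadows a method with a string.
atom.full_name = "THR1:CA"
method_clobbered = not callable(atom.full_name)

# Guarded: new attributes are accepted, existing names are rejected.
guarded = Node("CB")
set_new_attribute(guarded, "charge", 0.2)
try:
    set_new_attribute(guarded, "full_name", "THR1:CB")
    guard_raised = False
except AttributeError:
    guard_raised = True

print(method_clobbered, guard_raised, guarded.full_name())
```

The guard trades away some of the dynamic convenience: it also blocks legitimate updates of an existing data attribute, so a real design would need a separate update path.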
MolKit Critique: GOOD
• Clever PDB parser
• Lightweight and Platform independent
• Dynamic (on the fly creation of attributes)
• Introspection
• Support for Set and Set operations
• Python syntax-based selection
Re-usable component (has been combined with many Python packages developed independently)
DEMO