Computational Representation of Biological Molecules
Michel F. Sanner, The Molecular Graphics Laboratory, The Scripps Research Institute (TSRI), La Jolla, California
CRBM, Sept. 9-10, 2003, UCSD, San Diego, CA

Page 1: Computational Representation of Biological Molecules

Computational Representation of Biological Molecules

Michel F. Sanner

The Scripps Research Institute, La Jolla, California

The Molecular Graphics Laboratory

CRBM Sept. 9-10, 2003 UCSD, San Diego, Ca

TSRI

Page 2: Computational Representation of Biological Molecules

Protein-Ligand
Protein-Protein
Complex Assemblies
Molecular Surfaces
AutoDock
SurfDock
MSMS
HARMONY

Page 3: Computational Representation of Biological Molecules

Tangible Models

Page 4: Computational Representation of Biological Molecules

Augmented Reality

Page 5: Computational Representation of Biological Molecules

The challenge

Visualization
Docking Methods
Folding
Protein Engineering
Sequence Analysis
Modeling
MM - MD
Ab Initio Methods
Electrostatics Calculations
Molecular Surfaces
Etc ...

Page 6: Computational Representation of Biological Molecules

Python to the rescue

High level language as a scripting environment

[Diagram: components plugged into the Python environment: Molecular Surfaces, Molecules, DataBase, Electrostatics, Delaunay, Homology, CSG, 3D Viewer, MM-MD, New Method, Your Method]

Page 7: Computational Representation of Biological Molecules

Software components

• MolKit:
– read/write/represent/manipulate and query molecules

• DejaVu:
– General purpose 3D geometry viewer

• ViewerFramework:
– Visualization application template

• Mslib, PyBabel, PyMead, SFF, ...

(Sophie I. Coon, Michel F. Sanner and Art J. Olson, Re-usable components for structural bioinformatics, 9th Python Conference, 2001)

Page 8: Computational Representation of Biological Molecules

MolKit

from MolKit.pdbParser import PdbParser
parser = PdbParser('1crn.pdb')
mols = parser.parse()

[Diagram: a Parser (PDB, Mol2, PQR, ...) produces a MoleculeSet. The PDB and MOL2 parsers build molecule trees of varying depth: Molecule > Chain > Residue > Atom, Molecule > Residue > Atom, or Molecule > Atom.]
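To make the parser-to-hierarchy idea concrete, here is a minimal sketch that does not use MolKit at all. It reads fixed-column PDB ATOM/HETATM records and groups them as chain > residue > atoms using plain dictionaries. The function names (`format_atom_line`, `parse_atom_line`, `build_hierarchy`) and the coordinate values are hypothetical, chosen only for illustration, and the sketch is written for Python 3 rather than the Python 2 used on the slides.

```python
def format_atom_line(serial, name, res_name, chain_id, res_seq, x, y, z, b_factor):
    """Build a fixed-column PDB ATOM record (illustrative helper, occupancy fixed at 1.0)."""
    return ("ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f"
            % (serial, name, res_name, chain_id, res_seq, x, y, z, 1.0, b_factor))


def parse_atom_line(line):
    """Parse one ATOM/HETATM record using the PDB fixed-column layout."""
    return {
        "name": line[12:16].strip(),            # columns 13-16
        "res_name": line[17:20].strip(),        # columns 18-20
        "chain_id": line[21],                   # column 22
        "res_seq": int(line[22:26]),            # columns 23-26
        "coords": (float(line[30:38]),          # columns 31-38
                   float(line[38:46]),          # columns 39-46
                   float(line[46:54])),         # columns 47-54
        "temperature_factor": float(line[60:66]),  # columns 61-66
    }


def build_hierarchy(pdb_text):
    """Group atoms as chain -> (residue name, number) -> list of atoms, in file order."""
    chains = {}
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue  # skip headers, remarks, TER, END, ...
        atom = parse_atom_line(line)
        residues = chains.setdefault(atom["chain_id"], {})
        residues.setdefault((atom["res_name"], atom["res_seq"]), []).append(atom)
    return chains


# Made-up records, not taken from any real structure.
sample = "\n".join([
    "REMARK illustrative data only",
    format_atom_line(1, " N", "THR", "A", 1, 1.0, 2.0, 3.0, 10.0),
    format_atom_line(2, " CA", "THR", "A", 1, 2.0, 3.0, 4.0, 25.0),
    format_atom_line(3, " N", "CYS", "A", 2, 3.0, 4.0, 5.0, 30.0),
])
hierarchy = build_hierarchy(sample)
```

A real parser also has to handle alternate locations, insertion codes, multiple models and malformed records, which this sketch ignores.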

Page 9: Computational Representation of Biological Molecules

MolKit

[Diagram: MolKit layered on Numeric]

TreeNode
.parent
.children
.top
.name
...
adopt(child)

TreeNodeSet(ListSet)
[TreeNode1, TreeNode2, …]
.elementType
.__getattr__(self, name) returns [TreeNode1.name, TreeNode2.name, …]
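The attribute broadcasting described on this slide can be sketched in a few lines of modern Python. This is a toy reimplementation under stated assumptions, not MolKit's code: the classes below reuse the names `TreeNode` and `TreeNodeSet` only for readability, the set derives from `list` instead of MolKit's `ListSet`, and the result is always a plain list.

```python
class TreeNode:
    """Toy tree node with parent/children links (not the MolKit class)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        if parent is not None:
            parent.adopt(self)

    def adopt(self, child):
        """Attach child under this node."""
        child.parent = self
        self.children.append(child)

    @property
    def top(self):
        """Walk up the parent links to the root of the tree."""
        node = self
        while node.parent is not None:
            node = node.parent
        return node


class TreeNodeSet(list):
    """Toy node set: unknown attributes are looked up on every member."""

    def __getattr__(self, name):
        # Only called when normal lookup fails. Refuse dunder names so that
        # copy, pickle and similar protocols keep their default behavior.
        if name.startswith("__"):
            raise AttributeError(name)
        return [getattr(node, name) for node in self]


mol = TreeNode("mol")
chain_a = TreeNode("A", parent=mol)
chain_b = TreeNode("B", parent=mol)
chains = TreeNodeSet(mol.children)
names = chains.name   # list of each member's .name
tops = chains.top     # list of each member's root node
```

This sketch stops at one level. Chained expressions such as `mol.chains.residues.name` need the intermediate result to be a set again, which is not reproduced here.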

Page 10: Computational Representation of Biological Molecules

TreeNode, TreeNodeSet

• TreeNodeSet:
– Boolean operation
– uniq( )
– split( )
– sort( )
– NodesFromName( )
– findChildrenOfType( )
– findParentOfType( )
– …

• TreeNode:
– adopt( ) / remove( )
– full_name( )
– NodeFromName( )
– split( ) / merge( )
– getParentOfType( )
– findType( )
– compare( )
– assignUniqIndex( )
– isAbove( ) / isBelow( )
– …
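As a rough illustration of what "Boolean operation" and `uniq( )` could mean for an ordered set of nodes, here is a minimal sketch. The class name `NodeSet` and the choice of the `|`, `&` and `-` operators are assumptions made for this example; the slide does not say how MolKit spells these operations. Members are compared by identity so that two distinct atoms with equal attributes stay distinct.

```python
class NodeSet(list):
    """Toy ordered set of nodes; membership is decided by object identity."""

    def uniq(self):
        """Return a new set with repeated nodes dropped, keeping first-seen order."""
        seen = set()
        unique = NodeSet()
        for node in self:
            if id(node) not in seen:
                seen.add(id(node))
                unique.append(node)
        return unique

    def __or__(self, other):
        """Union: nodes in either set."""
        return NodeSet(list(self) + list(other)).uniq()

    def __and__(self, other):
        """Intersection: nodes present in both sets."""
        other_ids = {id(node) for node in other}
        return NodeSet(n for n in self if id(n) in other_ids).uniq()

    def __sub__(self, other):
        """Difference: nodes in this set but not in the other."""
        other_ids = {id(node) for node in other}
        return NodeSet(n for n in self if id(n) not in other_ids).uniq()


a, b, c = object(), object(), object()   # stand-ins for atoms or residues
first = NodeSet([a, b, b])
second = NodeSet([b, c])
union = first | second
common = first & second
only_first = first - second
```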

Page 11: Computational Representation of Biological Molecules

TreeNode and TreeNodeSet specialization

[Class diagram, TreeNode specializations: Molecule, Atom, Residue, Chain, Protein, ..., SecondaryStructure (Helix, Strand, Turn, Coil), ...]

[Class diagram, TreeNodeSet specializations: MoleculeSet, AtomSet, ResidueSet, ChainSet, ProteinSet, ..., SecondaryStructureSet (HelixSet, StrandSet, TurnSet, CoilSet), ...]

Page 12: Computational Representation of Biological Molecules

Examples

>>> from MolKit import Read
>>> molecules = Read('./1crn.pdb')  # Read returns a ProteinSet
>>> mol = molecules[0]
>>> print mol.chains.residues.name
>>> print mol.chains.residues.atoms[20:85].full_name()
>>> from MolKit.molecule import Atom
>>> allAtoms = mol.findType(Atom)
>>> set1 = allAtoms.get(lambda x: x.temperatureFactor > 20)
>>> allResidues = allAtoms.parent.uniq()
>>> import Numeric
>>> for r in allResidues:
...     coords = r.atoms.coords
...     r.geomCenter = Numeric.sum(coords) / len(coords)
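The last loop computes a per-residue geometric center: `Numeric.sum` adds the coordinate rows along the first axis, giving one (x, y, z) total that is then divided by the number of atoms. A dependency-free sketch of the same arithmetic in modern Python follows; the function name `geometric_center` is made up for this example, and plain tuples stand in for the Numeric arrays.

```python
def geometric_center(coords):
    """Average a sequence of (x, y, z) points axis by axis."""
    points = [tuple(float(v) for v in point) for point in coords]
    if not points:
        raise ValueError("need at least one coordinate")
    count = len(points)
    # zip(*points) regroups the rows into one tuple per axis: all x, all y, all z.
    return tuple(sum(axis) / count for axis in zip(*points))


# Two made-up atom positions.
center = geometric_center([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
```

With NumPy, the successor to Numeric, the same result is usually written as a mean along axis 0.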

Page 13: Computational Representation of Biological Molecules

MolKit Features

• PDB, PQR, MOL2 parsers

• Support for Sets

• Selection mechanism using Python syntax

• Secondary Structure

• Amber parameters assignment

• Atomic radii assignment (regexp)
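The "(regexp)" note suggests that radii are assigned by matching atom names against patterns. The slide gives no details, so the following is only a guess at the general technique: an ordered list of (pattern, radius) rules where the first match wins. The rule table, the default value and the name `assign_radius` are all hypothetical. The numbers are the commonly quoted Bondi van der Waals radii in angstroms, not MolKit's table.

```python
import re

# Illustrative rules only: first matching pattern wins.
RADIUS_RULES = [
    (re.compile(r"^H"), 1.20),
    (re.compile(r"^C"), 1.70),
    (re.compile(r"^N"), 1.55),
    (re.compile(r"^O"), 1.52),
    (re.compile(r"^S"), 1.80),
]

DEFAULT_RADIUS = 1.50  # arbitrary fallback for names no rule covers


def assign_radius(atom_name, rules=RADIUS_RULES, default=DEFAULT_RADIUS):
    """Return the radius of the first rule whose pattern matches the atom name."""
    name = atom_name.strip().upper()
    for pattern, radius in rules:
        if pattern.search(name):
            return radius
    return default


radii = {name: assign_radius(name) for name in ["N", "CA", "OXT", "SG", "ZN"]}
```

Matching on the first letter is deliberately naive: it treats the alpha carbon "CA" as carbon, but it would do the same for a calcium ion with the same name. Telling those apart needs the element or residue context.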

Page 14: Computational Representation of Biological Molecules

MolKit Critique: BAD

• No well defined API

• Hierarchy of proteins

• Danger of stamping over attributes

• Slow for large structures

• Hierarchical structure not used much

• Requires a Python interpreter

[Scale beside the list: Severe to Mild]
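"Danger of stamping over attributes" is the flip side of creating attributes on the fly: nothing stops a script from reusing a name that the object already relies on. The sketch below shows the failure and one possible guard. The `Atom` class here is a stand-in, not MolKit's, and `set_new_attribute` is a hypothetical helper.

```python
class Atom:
    """Stand-in atom class with one method that scripts might shadow by accident."""

    def __init__(self, name):
        self.name = name

    def full_name(self):
        return self.name


def clobber_demo():
    """Show that assigning to an existing method name breaks later calls."""
    atom = Atom("CA")
    atom.full_name = "my label"      # instance attribute now hides the method
    try:
        atom.full_name()
    except TypeError:
        return True                  # a str is not callable
    return False


def set_new_attribute(obj, attr, value):
    """Create attr on obj, refusing to overwrite anything that already exists."""
    if hasattr(obj, attr):
        raise AttributeError("%r already has attribute %r" % (obj, attr))
    setattr(obj, attr, value)


atom = Atom("CA")
set_new_attribute(atom, "geomCenter", (0.0, 0.0, 0.0))   # new name: accepted
```

A guard like this costs one `hasattr` call per assignment and only helps when scripts go through it, which is part of why the slide lists the problem as hard to avoid.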

Page 15: Computational Representation of Biological Molecules

MolKit Critique: GOOD

• Clever PDB parser

• Lightweight and Platform independent

• Dynamic (on the fly creation of attributes)

• Introspection

• Support for Set and Set operations

• Python syntax-based selection

Re-usable component (has been combined with many Python packages developed independently)

Page 16: Computational Representation of Biological Molecules

DEMO