psi structural genomics knowledgebase

PSI Structural Genomics Knowledgebase Helen M. Berman Bottlenecks Workshop April 14, 2008

TRANSCRIPT

Page 1: PSI Structural Genomics Knowledgebase

PSI Structural Genomics Knowledgebase

Helen M. Berman

Bottlenecks Workshop

April 14, 2008

Page 2: PSI Structural Genomics Knowledgebase

Knowledgebase

Page 3: PSI Structural Genomics Knowledgebase

PSI SG Knowledgebase

Knowledgebase Vision

The PSI Structural Genomics Knowledgebase (PSI SG KB) will turn the products of the PSI effort into major advances in knowledge that can be used to understand living systems and human disease.

The PSI SG KB will be a key resource for the advancement of biology, biochemistry, functional genomics, pharmacology, bioinformatics, chemistry, education, and clinical medicine.

Page 4: PSI Structural Genomics Knowledgebase

Knowledgebase Goals

To provide a “marketplace of ideas” that
- connects protein sequence information to 3D structures and homology models
- enhances functional annotations
- provides access to new experimental protocols and materials

To kick-start and enable advancements in structural genomics
- by communicating and providing visibility and accessibility of information and technology advances of the PSI
- through presentation and discussion of the most provocative challenges with the general community
- by fostering community collaborations

Page 5: PSI Structural Genomics Knowledgebase

To capture, make accessible, and highlight elements of the high throughput pipelines for general use in the community and to leverage such information through the generation of hundreds of thousands of molecular models and functional annotation. Standard metrics will be used to measure progress.

[Figure: high-throughput pipeline diagram. Stage labels: Genomic-Based Target Selection; Isolation, Expression, Purification, Crystallization; Data Collection; Structure Determination; PDB Deposition & Release; Models, Annotations, Publications. Other labels: Scope; Target Selection; Materials; Experimental Tracking; Technology; Metrics.]

Page 6: PSI Structural Genomics Knowledgebase

Knowledgebase Users
- Biologists
- Biochemists
- Functional Genomists
- Pharmacologists
- Bioinformatics
- Chemists
- Clinical Researchers and Physicians
- Teachers and Students

Page 7: PSI Structural Genomics Knowledgebase

KB Site Features

[Screenshot: KB home page with callouts: News and Events; Molecules of Unknown Function; Link to Functional Sleuth Gallery; Featured Structure; Link to Technology Module; Technology Feature; Search by Sequence, Keyword, or PDB ID.]
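The three search modes named on this slide (sequence, keyword, PDB ID) could share one search box if the input is classified first. The following is only a rough sketch of such a classifier, not the portal's actual logic: the function name, the 20-residue threshold, and the rule that a PDB ID is one digit followed by three alphanumeric characters (the classic four-character form) are assumptions made for illustration.

```python
import re

# Classic four-character PDB ID: a digit followed by three alphanumerics.
PDB_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")
# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def classify_query(text, min_sequence_length=20):
    """Guess whether a query is a PDB ID, a protein sequence, or a keyword.

    Hypothetical heuristic: long keywords built only from amino-acid
    letters would be misread as sequences, so a real system would need
    a more careful rule or an explicit selector.
    """
    query = text.strip()
    if PDB_ID.fullmatch(query):
        return "pdb_id"
    letters = "".join(query.split()).upper()
    if len(letters) >= min_sequence_length and set(letters) <= AMINO_ACIDS:
        return "sequence"
    return "keyword"


for example in ["1ABC", "MKTAYIAKQRQISFVKSHFSRQ", "kinase"]:
    print(example, "->", classify_query(example))
```

The example inputs are made up to show the format of each query type; they are not taken from the presentation.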

Page 8: PSI Structural Genomics Knowledgebase

PSI SG KB Portal
- Collects sequences, common features, and common identifiers
- Maintains correspondences in local database
- Delivers aggregate reports, inventories, and e-publications which contain links to PSI projects, modules and external resources
- Delivers featured articles describing PSI news and events, featured molecules and technologies, and molecules of unknown function
- Provides collaborative environments for discussion, annotation, and target suggestions
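One way to picture "maintains correspondences in local database" is a cross-reference index that links identifiers from different resources when they describe the same protein. The sketch below is an in-memory toy under that assumption; the class name, its methods, and all identifiers are hypothetical and do not describe the portal's real schema.

```python
from collections import defaultdict


class CorrespondenceIndex:
    """Toy in-memory stand-in for a local correspondence database."""

    def __init__(self):
        # (resource, identifier) -> set of directly linked (resource, identifier) pairs
        self._links = defaultdict(set)

    def link(self, a, b):
        """Record that two (resource, identifier) pairs describe the same protein."""
        self._links[a].add(b)
        self._links[b].add(a)

    def lookup(self, key):
        """Return every pair reachable from `key`, following links transitively."""
        seen, stack = {key}, [key]
        while stack:
            for neighbor in self._links[stack.pop()]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        seen.discard(key)
        return sorted(seen)


index = CorrespondenceIndex()
# Placeholder identifiers, not real database entries.
index.link(("TargetDB", "EXAMPLE-T1"), ("PDB", "XXXX"))
index.link(("PDB", "XXXX"), ("Pfam", "PF-EXAMPLE"))
related = index.lookup(("TargetDB", "EXAMPLE-T1"))
print(related)
```

Following links transitively means a target can be connected to a domain family even when the only recorded link goes through a PDB entry, which is the kind of aggregation the bullet list describes.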

Page 9: PSI Structural Genomics Knowledgebase

PSI SG KB Portal Databases

[Figure: portal architecture diagram. Labels: Queries (PDB ID, Sequence, Keyword); PSI Modules; PSI Centers; PSI Info Site; Models Portal; Related Biological Resources; Archival Sequence Databases; Domain Databases (Pfam); Literature (PubMed); TargetDB; PepcDB; PDB; TargetDB Sequences; PDB Sequences; Keyword Database; Portal; Resource; Database.]

Page 10: PSI Structural Genomics Knowledgebase

Modules

Modules derived from PSI information and external resources
- Target Selection & Experimental Data Tracking
- Materials Repository
- Models
- Annotation
- Metrics
- Technology
- Outreach

Page 11: PSI Structural Genomics Knowledgebase

Target Selection & Experimental Data Tracking

Target Selection – PSI-2 BIG4
- Family definitions and target management

TargetDB
- Search by sequence, Target ID, project site, status, update date, protein name, and source organism
- Links to other sequence databases, domain databases, other structural genomics centers, and PDB
- Download target data
- Target statistics summary

PepcDB
- All the functionality of TargetDB plus
  - Experimental protocols
  - Detailed status history of experimental trials
  - Information on failed experiments
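The TargetDB search fields listed on this slide map naturally onto a record with one attribute per field. The sketch below filters such records by exact match; it is an illustration only, with a hypothetical `Target` record, invented status labels, and made-up example data. Sequence search is left out because it would need an alignment tool rather than a simple comparison.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Target:
    """Hypothetical record with one field per search option on the slide."""
    target_id: str
    project_site: str
    status: str
    updated: date
    protein_name: str
    source_organism: str


def search_targets(targets, **criteria):
    """Return targets whose fields equal every given criterion (exact match)."""
    return [
        t for t in targets
        if all(getattr(t, name) == value for name, value in criteria.items())
    ]


# Invented example records, not real TargetDB entries.
targets = [
    Target("T0001", "Center A", "crystallized", date(2008, 3, 1), "hypothetical protein", "Organism X"),
    Target("T0002", "Center B", "cloned", date(2008, 2, 11), "kinase-like protein", "Organism Y"),
    Target("T0003", "Center A", "cloned", date(2008, 1, 20), "hypothetical protein", "Organism Y"),
]

hits = search_targets(targets, project_site="Center A", status="cloned")
print([t.target_id for t in hits])
```

A PepcDB-style record would extend this with a protocol description and a list of dated status changes, including failed trials.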

Page 12: PSI Structural Genomics Knowledgebase

Experimental Tracking

PepcDB Search Form

Protocol Keywords Search

Page 13: PSI Structural Genomics Knowledgebase

Page 14: PSI Structural Genomics Knowledgebase

Experimental Tracking Module

Page 15: PSI Structural Genomics Knowledgebase

Page 16: PSI Structural Genomics Knowledgebase

Materials Repository

Page 17: PSI Structural Genomics Knowledgebase

PSI Materials Repository Module

Page 18: PSI Structural Genomics Knowledgebase

Page 19: PSI Structural Genomics Knowledgebase

Modeling Portal

Current Phase 1 Model Portal contains models from 4 PSI centers and 2 public model databases (SwissModel and ModBase) integrated on a common UniProt reference system.

Current release consists of 5.8 million comparative protein models for 1.97 million distinct UniProt entries.
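As a quick reading aid, the two release figures imply an average number of models per UniProt entry. The short calculation below uses only the numbers quoted on the slide; because those are rounded, the result is approximate.

```python
# Figures quoted on the slide (rounded in the source).
models = 5_800_000
uniprot_entries = 1_970_000

models_per_entry = models / uniprot_entries
print(f"about {models_per_entry:.1f} models per UniProt entry on average")
```

An average near three is consistent with several sources (four PSI centers plus two public databases) contributing models for the same entry.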

Page 20: PSI Structural Genomics Knowledgebase
Page 21: PSI Structural Genomics Knowledgebase
Page 22: PSI Structural Genomics Knowledgebase
Page 23: PSI Structural Genomics Knowledgebase

Modeling Portal

Page 24: PSI Structural Genomics Knowledgebase

Metrics Module

Provides objective measures of the progress and output of the PSI project

Centered around “Goals and Milestones” document

Page 25: PSI Structural Genomics Knowledgebase

PSI-2 Summary Statistics
Updated April 1, 2008

I.1.A  Number of novel experimental PSI-2 structures: 1031
I.1.B  Number of distinct experimental PSI-2 structures (non-redundant sequences): 1428
I.1.D  Total number of experimental PSI-2 structures: 1628
I.1.E  Numbers of experimentally determined distinct residues: 319977
       Numbers of experimentally determined novel residues: 225518
I.2.J  Number of experimental structures of human proteins: 61
I.2.K  Number of experimental structures of eukaryotic proteins: 186
I.2.M  Number of experimental structures of membrane proteins: 1
I.2.N  Number of experimental structures determined at the atomic level using x-ray crystallography: 1484
       Number of experimental structures determined at the atomic level using NMR methods: 144
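The statistics on this slide can be cross-checked with a few lines of arithmetic. The sketch below copies the values from the slide into a dictionary (the key names are made up for readability), confirms that the x-ray and NMR counts add up to the reported total, and derives the novel fractions. The fractions are a derived illustration, not metrics defined in the "Goals and Milestones" document.

```python
# Values copied from the PSI-2 summary statistics slide (April 1, 2008).
stats = {
    "novel_structures": 1031,
    "distinct_structures": 1428,
    "total_structures": 1628,
    "distinct_residues": 319977,
    "novel_residues": 225518,
    "xray_structures": 1484,
    "nmr_structures": 144,
}

# X-ray and NMR structures together should account for the total.
by_method = stats["xray_structures"] + stats["nmr_structures"]
methods_match_total = by_method == stats["total_structures"]

# Share of structures and of residues that count as novel.
novel_structure_fraction = stats["novel_structures"] / stats["total_structures"]
novel_residue_fraction = stats["novel_residues"] / stats["distinct_residues"]

print(f"methods sum to total: {methods_match_total}")
print(f"novel structures: {novel_structure_fraction:.1%}")
print(f"novel residues:   {novel_residue_fraction:.1%}")
```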

Page 26: PSI Structural Genomics Knowledgebase

PSI-2 Summary Statistics for Domain and Modeling Leverage

I.1.C  Number and Size of BIG Domain Families for which PSI-2 provides the first Experimental Structure Representative: 474
       Number and Size of MEGA Domain Families for which PSI-2 provides the first Experimental Structure Representative: 399
I.1.E  Numbers of Experimentally Determined Distinct BIG Family Residues: 76579
       Numbers of Experimentally Determined Distinct MEGA Family Residues: 76121
I.3.A  Total Modeling Leverage: 583735
I.3.B  Novel Modeling Leverage: 114407

Updated January 15, 2008

Updated February 21, 2008

Page 27: PSI Structural Genomics Knowledgebase

Technology Module

PSI Centers are actively developing technologies and methodologies for all aspects of the structure determination pipeline

[Figure: pipeline diagram. Stage labels: Genomic-Based Target Selection; Isolation, Expression, Purification, Crystallization; Data Collection; Structure Determination; PDB Deposition & Release; Functional Annotation; Publication.]

Page 28: PSI Structural Genomics Knowledgebase

Technology Module Progress
- Phase 1 Technology Portal in place
- Summary Information from all PSI Centers
- Keyword search from KB portal

Page 29: PSI Structural Genomics Knowledgebase

Page 30: PSI Structural Genomics Knowledgebase

Page 31: PSI Structural Genomics Knowledgebase

Page 32: PSI Structural Genomics Knowledgebase

Page 33: PSI Structural Genomics Knowledgebase

Page 34: PSI Structural Genomics Knowledgebase

Page 35: PSI Structural Genomics Knowledgebase

Outreach Module

Provides information to the public about the products and accomplishments of the PSI
- Media reports
- Publications
- Community activities
- Plans for a Nature Gateway

Page 36: PSI Structural Genomics Knowledgebase

Page 37: PSI Structural Genomics Knowledgebase

Current Annotation Module

10 PSI Interactive Services for Sequence, Structure and Functional Annotations

11 PSI Galleries and Summaries of Sequence, Structure and Functional Annotations

35 other resources for annotation

Provides paths to unravel sequence, structure, function relationships

Page 38: PSI Structural Genomics Knowledgebase

Annotation Module

Page 39: PSI Structural Genomics Knowledgebase

Page 40: PSI Structural Genomics Knowledgebase

Biological Annotation of Novel Proteins
March 7-8, 2008, Calit2, UCSD

Participants
- PSI groups
- Annotation system authors
- General biological community

Outcome
- Recommendations for standard annotations
- Processes for community input

Page 41: PSI Structural Genomics Knowledgebase

Standard Annotations
- Genomic features: gene identifier, name and synonyms, operon/regulon mappings
- Protein sequence features: amino acid sequence, taxonomy & phylogeny, sequence database accession, isoform, SNPs, PTMs, sequence families, residue conservation
- Structure features: oligomeric state, structure and functional domains, DNA binding motifs, nests & clefts, sites of interaction, residue regions of protein-protein, ligand-protein, catalytic sites, secondary structure, structural neighbors and comparison of groups of structures with common features, properties/features mapped to 3D and their similarities (e.g. electrostatics, cavities, conserved residues, quality assessment)
- Ligands: chemical structure, interactions, functional role
- Functional classification: GO, FunCat, EC, epitope mapping, cellular location, organ location, substrate specificity, disease involvement
- Mapping to Biological Systems: mapping to networks and pathways (e.g. Reactome, KEGG, HPRD, BioCyc, NetPath, MINT, MIPS, DIP, STRING, STITCH, PROLINKS)
- Literature: synonyms for protein names, links to PubMed by database identifier and related text and authors
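The annotation categories on this slide suggest a record with one slot per category. The sketch below is a hypothetical container of that shape, not a format proposed at the workshop: the class name, field names, and placeholder values are all assumptions. It also shows how such a record could report which categories are still empty for a protein.

```python
from dataclasses import dataclass, field


@dataclass
class StandardAnnotation:
    """Hypothetical container with one slot per category on the slide."""
    genomic_features: dict = field(default_factory=dict)
    protein_sequence_features: dict = field(default_factory=dict)
    structure_features: dict = field(default_factory=dict)
    ligands: list = field(default_factory=list)
    functional_classification: dict = field(default_factory=dict)
    biological_systems: dict = field(default_factory=dict)
    literature: list = field(default_factory=list)

    def missing_categories(self):
        """Names of categories that have no annotations yet."""
        return [name for name, value in vars(self).items() if not value]


record = StandardAnnotation(
    genomic_features={"gene_identifier": "GENE-EXAMPLE"},   # placeholder value
    functional_classification={"GO": ["GO:0000000"]},       # placeholder term
)
todo = record.missing_categories()
print(todo)
```

Keeping the categories as separate fields makes it straightforward to measure annotation coverage per category across many proteins.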

Page 42: PSI Structural Genomics Knowledgebase

Future Improvements

Experimental Data Tracking
- Standardization of the protocols in PepcDB
- PepcDB data deposition tool
- Integration with the Materials Repository

Materials Repository
- Searchable database of clones
- Ordering system
- Integration with PepcDB and PSI SGKB

Models Module
- Public web service interface
- Additional quality assessment
- Interactive homology modeling

Page 43: PSI Structural Genomics Knowledgebase

Future Improvements

Technology Module
- Improved navigation over technology topic areas
- Keyword search option of descriptions and publications

PSI SGKB
- Integration with Nature Gateway
- Simple presentation and search of standard annotations
- Incorporation of data about ligands and modified residues
- Molecular visualization tool

Page 44: PSI Structural Genomics Knowledgebase

Acknowledgements

KB Team
- Wendy Tao
- Raship Shah
- James Chun
- John Westbrook

Modules
- Torsten Schwede (Models)
- Andrei Kouranov (Exp. Data Tracking)
- Paul Adams (Technology)
- Wladek Minor (Publications)
- Josh La Baer (Materials)
- Rajesh Nair (Metrics)

Access Information
http://kb.psi-structuralgenomics.org/KB/