The Gene Ontology Project: Content for the Semantic Web

Upload: jasmine-roche

Post on 27-Mar-2015


TRANSCRIPT

Page 1: The Gene Ontology Project: Content for the Semantic Web

The Gene Ontology Project: Content for the Semantic Web

Page 2: The Gene Ontology Project: Content for the Semantic Web

• Compile structured vocabularies describing aspects of molecular biology

• Describe gene products using vocabulary terms (annotation)

• Develop tools:
  • to query and modify the vocabularies and annotations
  • annotation tools for curators

GO Project Goals

Page 3: The Gene Ontology Project: Content for the Semantic Web

GO provides two bodies of data:

• Terms with definitions and cross-references

• Gene product annotations with supporting data

GO Data

Page 4: The Gene Ontology Project: Content for the Semantic Web

• Molecular Function: elemental activity or task
  nuclease, DNA binding, transcription factor

• Biological Process: broad objective or goal
  mitosis, signal transduction, metabolism

• Cellular Component: location or complex
  nucleus, ribosome, origin recognition complex

The Three Ontologies

Page 5: The Gene Ontology Project: Content for the Semantic Web

DAG Structure

Directed acyclic graph: each child may have one or more parents
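The multi-parent structure can be sketched as a mapping from each child term to its parents. This is a minimal illustration, not GO's actual data model: the term names come from the cross-product slide later in the deck, and the specific edges are assumptions.

```python
# Hypothetical toy graph: child term -> list of parent terms.
# The edges are assumptions chosen for illustration only.
PARENTS = {
    "metabolism": [],
    "biosynthesis": ["metabolism"],
    "hexose metabolism": ["metabolism"],
    # A child with two parents, which a tree could not represent.
    "hexose biosynthesis": ["hexose metabolism", "biosynthesis"],
}

def ancestors(term, parents=PARENTS):
    """Return every term reachable by following parent links upward."""
    seen = set()
    stack = list(parents[term])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents[node])
    return seen

def is_acyclic(parents=PARENTS):
    """The graph is acyclic if no term is its own ancestor."""
    return all(term not in ancestors(term, parents) for term in parents)

print(sorted(ancestors("hexose biosynthesis")))
print(is_acyclic())
```

A plain dictionary is enough here because the only operations needed are "who are my parents" and "walk upward".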

Page 6: The Gene Ontology Project: Content for the Semantic Web

• is-a: subclass; a is a type of b

• part-of: physical part of (component); subprocess of (process)

Relationship Types

Page 7: The Gene Ontology Project: Content for the Semantic Web

Every path from a node back to the root must be biologically accurate

The True Path Rule
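One way to review a term against this rule is to enumerate every path from it to a root and inspect each path by hand. The sketch below assumes a "before revision" layout for the chitin example shown later in the deck; the slide's figure is not preserved in this transcript, so the edges are a guess and the helper name `paths_to_roots` is hypothetical.

```python
# Assumed "before revision" edges (child -> parents). The original figure is
# lost, so this layout is a guess used only to illustrate path enumeration.
PARENTS_BEFORE = {
    "cuticle synthesis": [],
    "cell wall biosynthesis": [],
    "chitin metabolism": ["cuticle synthesis", "cell wall biosynthesis"],
    "chitin biosynthesis": ["chitin metabolism"],
    "chitin catabolism": ["chitin metabolism"],
}

def paths_to_roots(term, parents):
    """List every path from `term` up to a term that has no parents."""
    if not parents[term]:
        return [[term]]
    return [
        [term] + rest
        for parent in parents[term]
        for rest in paths_to_roots(parent, parents)
    ]

# Each printed path is a claim that a curator would need to check.
for path in paths_to_roots("chitin biosynthesis", PARENTS_BEFORE):
    print(" -> ".join(path))
```

The code only lists the paths; whether each one is biologically accurate for every annotated gene product remains a curator's judgment.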

Page 8: The Gene Ontology Project: Content for the Semantic Web

• ID

• Text string

• Definition with source

• Synonyms (optional)

• Cross-references (optional)

GO Terms: Associated Data
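The fields listed above map naturally onto a small record type. The following is a sketch only: the class name, field names, and all values are placeholders invented for illustration, not GO's file format or a real term.

```python
from dataclasses import dataclass, field

@dataclass
class GOTerm:
    term_id: str             # ID
    name: str                # text string
    definition: str          # definition
    definition_source: str   # source of the definition
    synonyms: list = field(default_factory=list)  # optional
    xrefs: list = field(default_factory=list)     # optional cross-references

# Placeholder values only; this is not a real GO term.
term = GOTerm(
    term_id="GO:placeholder",
    name="example activity",
    definition="Placeholder definition text.",
    definition_source="placeholder source",
)
term.synonyms.append("example synonym")
print(term)
```

Using `default_factory` gives each term its own empty list, so the optional fields can be omitted without terms sharing state.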

Page 9: The Gene Ontology Project: Content for the Semantic Web

• Enzyme Commission (EC)

• Transport Commission (TC)

• University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD)

• MetaCyc

GO Terms: Cross-References

Page 10: The Gene Ontology Project: Content for the Semantic Web

• Association between gene product and applicable GO terms

• Provided by member databases

• Made by manual or automated methods

GO Annotation

Page 11: The Gene Ontology Project: Content for the Semantic Web

• Database object: gene or gene product

• GO term ID

• Reference: publication or computational method

• Evidence supporting annotation

GO Annotation: Data
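An annotation can be sketched as a four-field record, with an index built from database object to GO term IDs. The type name, the helper `terms_by_object`, and every value below are hypothetical placeholders rather than real annotations or an actual annotation file format.

```python
from collections import defaultdict
from typing import NamedTuple

class Annotation(NamedTuple):
    db_object: str   # gene or gene product identifier
    term_id: str     # GO term ID
    reference: str   # publication or computational method
    evidence: str    # evidence supporting the annotation

# All values are placeholders, not real annotations.
annotations = [
    Annotation("geneA", "GO:placeholder1", "ref-1", "evidence-x"),
    Annotation("geneA", "GO:placeholder2", "ref-2", "evidence-y"),
    Annotation("geneB", "GO:placeholder1", "ref-3", "evidence-x"),
]

def terms_by_object(records):
    """Group GO term IDs by the database object they annotate."""
    index = defaultdict(set)
    for rec in records:
        index[rec.db_object].add(rec.term_id)
    return dict(index)

print(terms_by_object(annotations))
```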

Page 12: The Gene Ontology Project: Content for the Semantic Web

DAG Structure

Annotate to any level within DAG

Page 13: The Gene Ontology Project: Content for the Semantic Web

• Improve coverage:
  • Developmental processes
  • Physiological processes

• Relational database

• Support ontology development for additional domains of biology

The Future of GO:

Page 14: The Gene Ontology Project: Content for the Semantic Web

• Names of gene products

• Protein domains

• Protein sequence features

• Phenotypes; diseases

• Anatomical terms (except as part of terms generated by cross-products)

Terms outside the Scope of GO

Page 15: The Gene Ontology Project: Content for the Semantic Web

• Global Open Biology Ontologies

• Umbrella site for shared genomics and proteomics vocabularies

• Present incarnation: subdirectory within GO repository:

ftp://ftp.geneontology.org/pub/go/gobo/README

The GOBO Proposal

Page 16: The Gene Ontology Project: Content for the Semantic Web

• FlyBase & Berkeley Drosophila Genome Project

• WormBase

• Saccharomyces Genome Database

• DictyBase

• Mouse Genome Informatics

• Compugen, Inc

• The Arabidopsis Information Resource

• Swiss-Prot/TrEMBL/InterPro

• Pathogen Sequencing Unit (Sanger Institute)

• PomBase (Sanger Institute)

• Rat Genome Database

• Genome Knowledge Base (CSHL)

• The Institute for Genomic Research

www.geneontology.org

The Gene Ontology Consortium is supported by NHGRI grant HG02273 (R01). The Gene Ontology project thanks AstraZeneca for financial support. The Stanford group acknowledges a gift from Incyte Genomics.

Page 17: The Gene Ontology Project: Content for the Semantic Web

Conference:

Standards and Ontologies for Functional Genomics (SOFG)

Towards unified ontologies for describing biology and biomedicine

17 – 20 November 2002

Hinxton Hall Conference Centre, Hinxton, Cambridge, UK

www.ebi.ac.uk/SOFG/

Page 18: The Gene Ontology Project: Content for the Semantic Web

First Standards and Ontologies for Functional Genomics

(SOFG)

Keynote Speakers

Ken Buetow, NCI, USA

Win Hide, SANBI, South Africa

Peter Karp, SRI International, USA

17-20 November 2002,

Hinxton, UK

Page 19: The Gene Ontology Project: Content for the Semantic Web

Aims and Objectives

• Bring together scientists developing standards and ontologies, including biologists, bioinformaticians, and computer scientists

Page 20: The Gene Ontology Project: Content for the Semantic Web

Topics

• Introduction to ontologies

• Tools for building ontologies

• GO and related ontologies

• Species-specific ontologies

• Implementation

• Inter-ontology mapping

• Ontologies for pathology, toxicology

• Chemical ontologies

Page 21: The Gene Ontology Project: Content for the Semantic Web

Structure

• 3 keynote speakers

• ~20 invited talks

• 10 short talks selected from poster abstracts

• Panel discussion

• Parallel working groups/tutorials

Page 22: The Gene Ontology Project: Content for the Semantic Web

Programme Committee

Michael Ashburner, University of Cambridge, UK (Chair)

Cathy Ball, Stanford University, USA

Mike Bittner, NHGRI, USA

Alvis Brazma, EMBL-EBI, UK

Catherine Brooksbank, EMBL-EBI, UK

Duncan Davidson, MRC HGU, Edinburgh, UK

Liz Ford, EMBL-EBI, UK

Midori Harris, EMBL-EBI, UK

Victor Markowitz, Gene Logic, USA

Helen Parkinson, EMBL-EBI, UK

John Quackenbush, TIGR, USA

Martin Ringwald, The Jackson Laboratories, USA

Steffen Schulze-Kremer, RZPD, Germany

Paul Spellman, U.C. Berkeley, USA

Robert Stevens, University of Manchester, UK

Chris Stoeckert, University of Pennsylvania, USA

Page 23: The Gene Ontology Project: Content for the Semantic Web

URL

http://www.ebi.ac.uk/microarray/General/Events/SOFG/SOFG.html

Page 24: The Gene Ontology Project: Content for the Semantic Web

chitin metabolism: before revision

The True Path Rule

chitin biosynthesis

cuticle synthesis

chitin catabolism

cell wall biosynthesis

chitin metabolism

Page 25: The Gene Ontology Project: Content for the Semantic Web

chitin metabolism: after revision

The True Path Rule

Page 26: The Gene Ontology Project: Content for the Semantic Web

chitin metabolism: after revision

The True Path Rule

cuticle synthesis

chitin metabolism

cuticle chitin biosynthesis

chitin biosynthesis

cuticle chitin metabolism

Page 27: The Gene Ontology Project: Content for the Semantic Web

• Open source

• Can be instantiated in DAML+OIL or GO syntax

• Orthogonal

• Shared ID space

• Defined terms

GOBO Criteria

Page 28: The Gene Ontology Project: Content for the Semantic Web

hexose, glucose, fructose

DAG Cross-Products

metabolism, biosynthesis, catabolism

hexose metabolism, hexose biosynthesis, glucose biosynthesis, fructose biosynthesis, hexose catabolism, glucose catabolism, fructose catabolism, glucose metabolism, ... etc.
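The term names on this slide can be generated by pairing each entry of one vocabulary with each entry of the other. A minimal sketch using `itertools.product` follows; it builds names only and says nothing about how GO itself generates or relates cross-product terms.

```python
from itertools import product

# The two input vocabularies, as listed on the slide.
sugars = ["hexose", "glucose", "fructose"]
processes = ["metabolism", "biosynthesis", "catabolism"]

# Every (sugar, process) pair becomes one candidate term name.
cross_product_terms = [f"{s} {p}" for s, p in product(sugars, processes)]

for name in cross_product_terms:
    print(name)
```

Three sugars and three processes yield nine candidate names, which is why the slide trails off with "etc."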

Page 29: The Gene Ontology Project: Content for the Semantic Web

gene
gene_attribute
gene_structure  SO
gene_variation  ME
gene_product
gene_product_attribute
molecular_function  GO
protein_family  INTERPRO
phenotype
mutant phenotype
anatomy

For complete current draft see ftp://ftp.geneontology.org/pub/go/gobo/README

Some GOBO Ontologies

Page 30: The Gene Ontology Project: Content for the Semantic Web

• Not a way to unify biological databases

• Not a dictated standard

• Does not define evolutionary relationships

• Additional ontologies needed to model biology and experimentation

What GO is NOT:

Page 31: The Gene Ontology Project: Content for the Semantic Web

DAG Structure

Annotate to any level within DAG

mitosis: S.c. NNF1

mitotic chromosome condensation: S.c. BRN1, D.m. barren
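Because annotations can sit at any level, a query for a general term is usually expected to also return gene products annotated to its more specific descendants. The sketch below uses the slide's example; it assumes that "mitotic chromosome condensation" is a child of "mitosis" (the relationship type is not stated here), and the function names are hypothetical.

```python
# Assumed edge: the specific term is a child of the general one.
PARENTS = {
    "mitosis": [],
    "mitotic chromosome condensation": ["mitosis"],
}

# Direct annotations, as shown on the slide.
DIRECT = {
    "mitosis": {"S.c. NNF1"},
    "mitotic chromosome condensation": {"S.c. BRN1", "D.m. barren"},
}

def ancestors(term):
    """All terms above `term` in the graph."""
    result = set()
    for parent in PARENTS[term]:
        result.add(parent)
        result |= ancestors(parent)
    return result

def gene_products_for(term):
    """Gene products annotated to `term` or to any of its descendants."""
    found = set()
    for annotated_term, products in DIRECT.items():
        if annotated_term == term or term in ancestors(annotated_term):
            found |= products
    return found

print(sorted(gene_products_for("mitosis")))
print(sorted(gene_products_for("mitotic chromosome condensation")))
```

Querying "mitosis" returns all three gene products, while querying the more specific term returns only the two annotated there.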

Page 32: The Gene Ontology Project: Content for the Semantic Web

Using GO Annotation: Example Workflow


Page 33: The Gene Ontology Project: Content for the Semantic Web

ID

definition

cross-reference

synonyms

Page 34: The Gene Ontology Project: Content for the Semantic Web

Using GO Annotation: Example Workflow