Page 1: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

An Introduction to Bio-Ontologies

Robert Stevens

Robert.Stevens@manchester.ac.uk

Page 2: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Introduction

• How we do bioinformatics
• What is knowledge
• What is an ontology
• Classes, individuals, …
• The components of an ontology
• Examples

Page 3: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

How We Do Bioinformatics

• No Euclid, no Newton
• No equations and no axioms
• Cannot take an amino acid sequence, submit it to an equation and get some biology
• … so we do similarity searches

Page 4: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Transferring Characteristics

[Figure: Uncharacterised protein; Tra1, La2, La3; high similarity, so transfer characteristics]

Page 5: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

What do we Transfer?

• When sequences are sufficiently similar, we transfer what we understand about one sequence to another

• The “understanding” is our knowledge about that protein
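The transfer step can be sketched in a few lines. This is only an illustration: the protein names, sequences, annotations and the 0.8 threshold are made up, and `difflib.SequenceMatcher` is a crude stand-in for a real similarity search such as BLAST.

```python
from difflib import SequenceMatcher

# Hypothetical characterised proteins: name -> (sequence, annotation).
# All values are invented for illustration.
CHARACTERISED = {
    "protA": ("MKTAYIAKQRQISFVKSHFSRQ", "annotation A"),
    "protB": ("MANLGCWMLVLFVATWSDLGLC", "annotation B"),
}


def similarity(a: str, b: str) -> float:
    """Crude similarity ratio in [0, 1]; not a real alignment score."""
    return SequenceMatcher(None, a, b).ratio()


def transfer_annotation(query: str, threshold: float = 0.8):
    """Return (name, annotation) of the best hit at or above threshold, else None."""
    best_name, best_score = None, 0.0
    for name, (seq, _annotation) in CHARACTERISED.items():
        score = similarity(query, seq)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is not None and best_score >= threshold:
        return best_name, CHARACTERISED[best_name][1]
    return None  # nothing similar enough, so nothing is transferred


# A query differing from protB by one residue inherits protB's annotation.
print(transfer_annotation("MANLGCWMLVLFVATWSDLGLA"))
```

What gets transferred is only as good as the annotation on the characterised protein, which is why the meaning of that annotation matters.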

Page 6: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

What is Knowledge?

• Knowledge – all information and an understanding to carry out tasks and to infer new information

• Information -- data equipped with meaning

• Data -- un-interpreted signals that reach our senses

Michael Ashburner | Professor | University of Cambridge | UK | ISMB
Name | Job | Institution | Country | Conf
man | academic, senior | ancient university, 5 rated | European | important figure in biology

BIOLOGY

Page 7: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Uniprot: A protein database?

ID PRIO_HUMAN STANDARD; PRT; 253 AA.
AC P04156;
DT 01-NOV-1986 (Rel. 03, Created)
DT 01-NOV-1986 (Rel. 03, Last sequence update)
DT 20-AUG-2001 (Rel. 40, Last annotation update)
DE Major prion protein precursor (PrP) (PrP27-30) (PrP33-35C) (ASCR).
GN PRNP.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=86300093; PubMed=3755672;
RA Kretzschmar H.A., Stowring L.E., Westaway D., Stubblebine W.H.,
RA Prusiner S.B., Dearmond S.J.;
RT "Molecular cloning of a human prion protein cDNA.";
RL DNA 5:315-324(1986).
RN [2]
RP SEQUENCE OF 8-253 FROM N.A.
RX MEDLINE=86261778; PubMed=3014653;
RA Liao Y.-C.J., Lebo R.V., Clawson G.A., Smuckler E.A.;
RT "Human prion protein cDNA: molecular cloning, chromosomal mapping,
RT and biological implications.";
RL Science 233:364-367(1986).
RN [3]
RP SEQUENCE OF 58-85 AND 111-150 (VARIANT AMYLOID GSS).
RX MEDLINE=91160504; PubMed=1672107;
RA Tagliavini F., Prelli F., Ghiso J., Bugiani O., Serban D.,
RA Prusiner S.B., Farlow M.R., Ghetti B., Frangione B.;
RT "Amyloid protein of Gerstmann-Straussler-Scheinker disease (Indiana
RT kindred) is an 11 kd fragment of prion protein with an N-terminal
RT glycine at codon 58.";
RL EMBO J. 10:513-519(1991).
RN [4]
RP STRUCTURE BY NMR OF 118-221.
RX MEDLINE=20359708; PubMed=10900000;
RA Calzolai L., Lysek D.A., Guntert P., von Schroetter C., Riek R.,
RA Zahn R., Wuethrich K.;
RT "NMR structures of three single-residue variants of the human prion
RT protein.";
RL Proc. Natl. Acad. Sci. U.S.A. 97:8340-8345(2000).
CC -!- FUNCTION: THE FUNCTION OF PRP IS NOT KNOWN. PRP IS ENCODED IN THE
CC HOST GENOME AND IS EXPRESSED BOTH IN NORMAL AND INFECTED CELLS.
CC -!- SUBUNIT: PRP HAS A TENDENCY TO AGGREGATE YIELDING POLYMERS CALLED
CC "RODS".
CC -!- SUBCELLULAR LOCATION: ATTACHED TO THE MEMBRANE BY A GPI-ANCHOR.
CC -!- POLYMORPHISM: THE FIVE TANDEM OCTAPEPTIDE REPEATS REGION IS HIGHLY
CC UNSTABLE. INSERTIONS OR DELETIONS OF OCTAPEPTIDE REPEAT UNITS ARE
CC ASSOCIATED TO PRION DISEASE.

FT SIGNAL 1 22
FT CHAIN 23 230 MAJOR PRION PROTEIN.
FT PROPEP 231 253 REMOVED IN MATURE FORM (BY SIMILARITY).
FT LIPID 230 230 GPI-ANCHOR (BY SIMILARITY).
FT CARBOHYD 181 181 N-LINKED (GLCNAC...) (PROBABLE).
FT DISULFID 179 214 BY SIMILARITY.
FT DOMAIN 51 91 5 X 8 AA TANDEM REPEATS OF P-H-G-G-G-W-G-
FT Q.
FT REPEAT 51 59 1.
FT REPEAT 60 67 2.
FT REPEAT 68 75 3.
FT REPEAT 76 83 4.
FT REPEAT 84 91 5.
FT IN PATIENTS WHO HAVE A PRP MUTATION AT
FT CODON 178: PATIENTS WITH MET DEVELOP FFI,
FT THOSE WITH VAL DEVELOP CJD).
FT /FTId=VAR_006467.
FT VARIANT 171 171 N -> S (IN SCHIZOAFFECTIVE DISORDER).
FT /FTId=VAR_006468.
FT VARIANT 178 178 D -> N (IN FFI AND CJD).
FT /FTId=VAR_006469.
FT VARIANT 180 180 V -> I (IN CJD).
FT /FTId=VAR_006470.
FT VARIANT 183 183 T -> A (IN FAMILIAL SPONGIFORM
FT ENCEPHALOPATHY).
FT /FTId=VAR_006471.
FT VARIANT 187 187 H -> R (IN GSS).
FT /FTId=VAR_008746.
FT VARIANT 188 188 T -> K (IN EOAD; DEMENTIA ASSOCIATED TO
FT PRION DISEASES).
FT /FTId=VAR_008748.
FT VARIANT 188 188 T -> R.
FT /FTId=VAR_008747.
FT VARIANT 196 196 E -> K (IN CJD).
FT /FTId=VAR_008749.
FT /FTId=VAR_006472.
SQ SEQUENCE 253 AA; 27661 MW; 43DB596BAAA66484 CRC64;
MANLGCWMLV LFVATWSDLG LCKKRPKPGG WNTGGSRYPG QGSPGGNRYP PQGGGGWGQP
HGGGWGQPHG GGWGQPHGGG WGQPHGGGWG QGGGTHSQWN KPSKPKTNMK HMAGAAAAGA
VVGGLGGYML GSAMSRPIIH FGSDYEDRYY RENMHRYPNQ VYYRPMDEYS NQNNFVHDCV
NITIKQHTVT TTTKGENFTE TDVKMMERVV EQMCITQYER ESQAYYQRGS SMVLFSSPPV
ILLISFLIFL IVG
//

CC -!- DISEASE: PRP IS FOUND IN HIGH QUANTITY IN THE
CC BRAIN OF HUMANS AND ANIMALS INFECTED
CC WITH NEURODEGENERATIVE DISEASES KNOWN AS
CC TRANSMISSIBLE SPONGIFORM ENCEPHALOPATHIES OR PRION
CC DISEASES, LIKE: CREUTZFELDT-JAKOB DISEASE (CJD),
CC GERSTMANN-STRAUSSLER SYNDROME (GSS), FATAL
CC FAMILIAL INSOMNIA (FFI) AND KURU IN HUMANS;
CC SCRAPIE IN SHEEP AND GOAT; BOVINE SPONGIFORM
CC ENCEPHALOPATHY (BSE) IN CATTLE; TRANSMISSIBLE
CC MINK ENCEPHALOPATHY (TME); CHRONIC WASTING
CC DISEASE (CWD) OF MULE DEER AND ELK; FELINE
CC SPONGIFORM ENCEPHALOPATHY (FSE) IN CATS AND
CC EXOTIC UNGULATE ENCEPHALOPATHY (EUE) IN
CC NYALA AND GREATER KUDU. THE PRION DISEASES
CC ILLUSTRATE THREE MANIFESTATIONS OF CNS
CC DEGENERATION: (1) INFECTIOUS (2)
CC SPORADIC AND (3) DOMINANTLY INHERITED FORMS.
CC TME, CWD, BSE, FSE, EUE ARE ALL THOUGHT TO
CC OCCUR AFTER CONSUMPTION OF PRION-INFECTED
CC FOODSTUFFS.
DR EMBL; M13667; AAA19664.1; -.
DR EMBL; M13899; AAA60182.1; -.
DR EMBL; D00015; BAA00011.1; -.
DR PIR; A05017; A05017.
DR PIR; A24173; A24173.
DR PIR; S14078; S14078.
DR PDB; 1E1G; 20-JUL-00.
DR PDB; 1E1J; 20-JUL-00.
DR PDB; 1E1P; 20-JUL-00.
DR PDB; 1E1S; 21-JUL-00.
DR PDB; 1E1U; 20-JUL-00.
DR PDB; 1E1W; 20-JUL-00.
DR MIM; 176640; -.
DR MIM; 123400; -.
DR MIM; 137440; -.
DR MIM; 245300; -.
DR MIM; 600072; -.
DR MIM; 604920; -.
DR InterPro; IPR000817; Prion.
DR Pfam; PF00377; prion; 1.
DR PRINTS; PR00341; PRION.
DR SMART; SM00157; PRP; 1.
DR PROSITE; PS00291; PRION_1; 1.
DR PROSITE; PS00706; PRION_2; 1.
KW Prion; Brain; Glycoprotein; GPI-anchor; Repeat; Signal;
KW 3D-structure; Polymorphism; Disease mutation.

Page 8: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

A Web of Knowledge in Bioinformatics

Page 9: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Words in Bioinformatics

“When I use a word,” Humpty Dumpty said in rather a scornful tone, “it means just what I choose it to mean - neither more nor less.”

“The question is,” said Alice, “whether you can make words mean so many different things.”

“The question is,” said Humpty Dumpty, “which is to be master - that’s all”

Through the Looking Glass Lewis Carroll

Page 10: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Post-Genomic Biology

• Fly, mouse, yeast, worm all have their own terminologies

• I want to compare genomes
• How?
• Sequences are comparable
• What we know about sequences is not (by human or machine)
• Need a common understanding of what sequences do

Page 11: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

A Shared Understanding

• Synonyms and homonyms are rife
• Need to know that terms in one resource mean the same in another resource
• Means comparisons are much easier: can ask questions over many resources
• A structure of relationships enables discovery and query abstractions
• Useful for both humans and computers
• The Gene Ontology allows queries outside one model organism
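A small sketch of why a shared vocabulary makes cross-resource questions easy. The resource names, gene names, local labels and term identifiers below are all hypothetical; the point is only that each local synonym is mapped once to a shared term, after which one query covers every resource.

```python
# Hypothetical mapping from each resource's local label to a shared term id.
LOCAL_TO_SHARED = {
    ("fly_db", "hydrolase"): "TERM:1",
    ("yeast_db", "hydrolytic enzyme"): "TERM:1",  # synonym of the same concept
    ("fly_db", "kinase"): "TERM:2",
}

# Hypothetical annotations: (resource, gene, local label).
ANNOTATIONS = [
    ("fly_db", "geneA", "hydrolase"),
    ("yeast_db", "geneB", "hydrolytic enzyme"),
    ("fly_db", "geneC", "kinase"),
]


def genes_for_shared_term(term_id):
    """Return sorted (resource, gene) pairs whose local label maps to term_id."""
    hits = []
    for resource, gene, label in ANNOTATIONS:
        if LOCAL_TO_SHARED.get((resource, label)) == term_id:
            hits.append((resource, gene))
    return sorted(hits)


# One question, asked over both resources at once.
print(genes_for_shared_term("TERM:1"))
```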

Page 12: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Gene Ontology http://www.geneontology.org

“a dynamic controlled vocabulary that can be applied to all eukaryotes”

Built by the community for the community.

Three organising principles: Molecular function, Biological process, Cellular component

Describes kinds of things and parts of things

Describes ~17,000 things

Page 13: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

London Bills of Mortality

Page 14: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Aggregated Stats

Page 15: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

The art of ranking things in genera and species is of no small importance and very much assists our judgment as well as our memory. You know how much it matters in botany, not to mention animals and other substances, or again moral and notional entities as some call them. Order largely depends on it, and many good authors write in such a way that their whole account could be divided and subdivided according to a procedure related to genera and species. This helps one not merely to retain things, but also to find them. And those who have laid out all sorts of notions under certain headings or categories have done something very useful.

Gottfried Wilhelm Leibniz, New Essays on Human Understanding

Page 16: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Ontology

• Semantics – the meaning of meaning.
• Philosophical discipline, branch of philosophy that deals with the nature and the organisation of reality.
• Science of Being (Aristotle, Metaphysics, IV,1)
• What is being?
• What are the features common to all beings?

Page 17: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

So What?

• Describing what “exists” in our domain
• We have Protein, Gene, Intron, Exon, Hydrolase activity, etc.
• We can also describe how these “things” relate to each other
• We can define what they mean; define the properties of these things such that we can recognise those things
• We are capturing our understanding
• Sharing this understanding between humans and computers
• Making what we understand explicit

Page 18: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

What Is An Ontology?

• No universally agreed-upon definition
• A “specification of a conceptualisation”
• Conceptualisation refers to the set of concepts that people use to talk about a given domain and the relationships among these concepts
• A set of vocabulary terms and definitions that capture a community’s understanding of their domain
• CS has perverted the original philosophy
• Ontology == conceptual model of a domain

Page 19: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

What Is An Ontology?

Elements that most agree on:

– classes = sets of things
– instances = members of classes
– relationships
– axioms = additional logical statements

Page 20: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

What Is An Ontology?

• Idea of a controlled vocabulary:

– Each element has a unique name
– Each element has a specified definition
– For a given entity or relationship in the domain, there should only be one element in the ontology representing it
– Ask for “hydrolase activity” and get all and only hydrolase activity
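The rules above can be sketched as a tiny data structure. This is an illustration under assumptions: the class and method names are invented here, names are compared case-insensitively, and the definitions and protein names are placeholder text rather than real Gene Ontology content.

```python
class ControlledVocabulary:
    """Terms with unique names and mandatory definitions."""

    def __init__(self):
        self._definitions = {}  # normalised name -> definition
        self._annotations = {}  # normalised name -> set of annotated items

    @staticmethod
    def _key(name):
        # Normalise case and whitespace so one concept has one element.
        return " ".join(name.lower().split())

    def add_term(self, name, definition):
        key = self._key(name)
        if not definition.strip():
            raise ValueError("every term needs a definition")
        if key in self._definitions:
            raise ValueError(f"duplicate term: {name!r}")
        self._definitions[key] = definition
        self._annotations[key] = set()

    def annotate(self, item, name):
        key = self._key(name)
        if key not in self._definitions:
            raise KeyError(f"unknown term: {name!r}")  # free text is rejected
        self._annotations[key].add(item)

    def items_for(self, name):
        """All and only the items annotated with this term."""
        return sorted(self._annotations[self._key(name)])


cv = ControlledVocabulary()
cv.add_term("hydrolase activity", "placeholder definition of hydrolase activity")
cv.add_term("kinase activity", "placeholder definition of kinase activity")
cv.annotate("proteinX", "Hydrolase activity")
cv.annotate("proteinY", "kinase activity")
print(cv.items_for("hydrolase activity"))
```

Because annotation with an unknown label fails, a query by term cannot miss items hidden behind a spelling variant.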

Page 21: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

What Is an Ontology?

• Hierarchy (or taxonomy) is very important:

– Classes arranged into a hierarchy
– subclass = descendant class
– direct subclass = child class
– superclass = ancestor class
– direct superclass = parent class

Page 22: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

What Is an Ontology?

• Can be a single hierarchy, in which each class can only have one direct superclass, or a multiple hierarchy (or polyhierarchy), in which each class can have more than one direct superclass
• is-a relationship between a class and its superclass(es)
• A class inherits the properties that have been defined for its superclass(es)
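A minimal sketch of a polyhierarchy with property inheritance. The class names and properties are illustrative assumptions, not taken from any real ontology; `PARENTS` holds only direct superclasses, and the ancestors are computed by walking upwards.

```python
# Direct superclasses (is-a). "membrane enzyme" has two parents: a polyhierarchy.
PARENTS = {
    "protein": [],
    "enzyme": ["protein"],
    "membrane protein": ["protein"],
    "membrane enzyme": ["enzyme", "membrane protein"],
}

# Properties stated directly on each class (hypothetical).
OWN_PROPERTIES = {
    "protein": {"made of amino acids"},
    "enzyme": {"catalyses a reaction"},
    "membrane protein": {"located in a membrane"},
    "membrane enzyme": set(),
}


def superclasses(cls):
    """All ancestors of cls, following every is-a link upwards."""
    seen = set()
    stack = list(PARENTS[cls])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(PARENTS[parent])
    return seen


def all_properties(cls):
    """Own properties plus everything inherited from the superclasses."""
    props = set(OWN_PROPERTIES.get(cls, set()))
    for ancestor in superclasses(cls):
        props |= OWN_PROPERTIES.get(ancestor, set())
    return props


print(sorted(superclasses("membrane enzyme")))
print(sorted(all_properties("membrane enzyme")))
```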

Page 23: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Why develop an ontology?

• To make domain assumptions explicit
  – Easier to change domain assumptions
  – Easier to understand and update legacy data
• To separate domain knowledge from operational knowledge
  – Re-use domain and operational knowledge separately
• A community reference for applications
• To share a consistent understanding of what information means.

Page 24: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Classes

• Classes: Sets of things in the world (nouns)
• Classes of individuals
• Classes: Person, protein, gene, DNA
• Individuals: Robert (NE 67 51 48A), a LARD protein, a TrpA gene, a bacterium O23912
• Classes represent the things we know in our domain

Page 25: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Properties

• Classes have properties that describe their nature
• Properties held by the individuals in a class
• Properties made by relationships to individuals in other classes
• Some properties must be held by a class
• These are necessary to be a member of a class
• Some properties are sufficient to define membership of a class
• These are sufficient to recognise an individual as being a class member

Page 26: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Classes

• Primitive classes:
  – properties are necessary
  – Globular protein must have hydrophobic core, but a protein with a hydrophobic core need not be a globular protein
• Defined classes:
  – properties are necessary + sufficient
  – Eukaryotic cells must have a nucleus. Every cell that contains a nucleus must be Eukaryotic.
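The difference between the two kinds of class can be sketched as follows. The `Individual` type, the class labels and the feature strings are assumptions made for this illustration; a real OWL reasoner works on logical axioms rather than string sets.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    name: str
    asserted_classes: set = field(default_factory=set)
    features: set = field(default_factory=set)


def violates_globular_protein(ind):
    """Primitive class: a hydrophobic core is necessary, so an asserted
    globular protein without one is an error. Nothing can be inferred
    in the other direction."""
    return (
        "GlobularProtein" in ind.asserted_classes
        and "hydrophobic core" not in ind.features
    )


def inferred_classes(ind):
    """Defined class: having a nucleus is necessary and sufficient, so any
    cell with a nucleus is recognised as a eukaryotic cell."""
    inferred = set(ind.asserted_classes)
    if "Cell" in ind.asserted_classes and "nucleus" in ind.features:
        inferred.add("EukaryoticCell")
    return inferred


some_protein = Individual("p1", {"Protein"}, {"hydrophobic core"})
some_cell = Individual("c1", {"Cell"}, {"nucleus"})

print(inferred_classes(some_protein))  # stays a Protein only
print(inferred_classes(some_cell))     # recognised as EukaryoticCell too
```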

Page 27: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

An explicit description of a domain

• Rather than arguing about the meaning of words
• We argue about characteristics of things
• Experience shows writing a list of characteristics or properties describing a “thing” saves much time
• Computationally useful – gives a computer something to work with…

[Figure: example hierarchy with nodes animal, rodent, cow, cat, mouse, dog, domestic, vermin, and an “eats” relationship]

Page 28: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Classification of the Classical Tyrosine Phosphatases

Page 29: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Incremental Addition of Protein Functional Domains

Phosphatase catalytic
Cadherin-like
Immunoglobulin
MAM domain
Cellular retinaldehyde
Adhesion recognition
Transmembrane
Fibronectin III
Glycosylation

Page 30: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Determining Class Definitions for Phosphatases

R2A

- Contains 2 protein tyrosine phosphatase domains

- Contains 1 transmembrane domain

- Contains 4 fibronectin domains

- Contains 1 immunoglobulin domain

- Contains 1 MAM domain

- Contains 1 cadherin-like domain

Form complete OWL descriptions and classify
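As a rough sketch of classification by domain composition, the R2A description above can be written as a table of domain counts and matched against a protein's domains. Exact matching on counts is a simplification assumed here, not how an OWL reasoner works, and `HYPOTHETICAL_B` is an invented second class included only so there is something to distinguish.

```python
from collections import Counter

# Class definitions as domain counts. R2A follows the list above;
# HYPOTHETICAL_B is made up for contrast.
CLASS_DEFINITIONS = {
    "R2A": Counter({
        "protein tyrosine phosphatase": 2,
        "transmembrane": 1,
        "fibronectin": 4,
        "immunoglobulin": 1,
        "MAM": 1,
        "cadherin-like": 1,
    }),
    "HYPOTHETICAL_B": Counter({
        "protein tyrosine phosphatase": 1,
        "transmembrane": 1,
    }),
}


def classify(domains):
    """Return the names of all classes whose definition matches exactly."""
    composition = Counter(domains)
    return sorted(
        name
        for name, definition in CLASS_DEFINITIONS.items()
        if composition == definition
    )


# Domains of a hypothetical protein instance, in any order.
protein_domains = (
    ["protein tyrosine phosphatase"] * 2
    + ["fibronectin"] * 4
    + ["transmembrane", "immunoglobulin", "MAM", "cadherin-like"]
)
print(classify(protein_domains))
```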

Page 31: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

What is the Ontology Telling Us?

• Each class of phosphatase defined in terms of domain composition
• We know the characteristics by which an individual protein can be recognised to be a member of a particular class of phosphatase
• We have this knowledge in a computational form
• If we had protein instances described in terms of the ontology, we could classify those individual proteins
• A catalogue of phosphatases

Page 32: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Classification of Protein Tyrosine Phosphatases

Page 33: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

So what is an ontology?

Catalog/ID
Thesauri
Terms/glossary
Informal is-a
Formal is-a
Formal instance
Frames (properties)
General logical constraints
Value restrictions
Disjointness, inverse, part of

Gene Ontology
Mouse Anatomy
EcoCyc
PharmGKB
TAMBIS
Arom

[Deborah McGuinness, Stanford]

Types of Description

Class of Service: Domain description SWISS-PROT service
Abstract Service: Inputs, outputs, algorithm
Service instance: As offered by EBI, NCBI
Invoked instance: What was called

myGrid Ontologies

Bioinformatics ontology
Web service ontology
Task ontology
Publishing ontology
Informatics ontology
Molecular biology ontology
Organisation ontology
Upper level ontology

Specialises. All concepts are subclassed from those in the more general ontology.

Contributes concepts to form definitions.

Using a Service Ontology

Discovery of an appropriate Web Service within a registry by its properties and capabilities;
Invocation by some agent;
Interoperability is increased by describing the semantic type of inputs and outputs;
Composition of new services;
Verification of a service’s properties;
Execution monitoring by tracking what is happening to the described aspects of a service and its sub-services.

Service Classifications

Classifications of the descriptions act as:

an index to the descriptions;
a filtering mechanism;
a query containment mechanism;
a substitution mechanism such that partial or imprecise matches can be catered for; and
a clustering mechanism for similar services.

Problems with Profile

Three Steps to Discovering & Preparing a Service

Ontologies & Services

Typing

Controlling inputs and outputs of services.

Mapping from WSDL / OGSA XML Schema types to (DAML+OIL) concepts

Classifying

Indexing services and data

OGSA – factories and service instances.

Organising services based on reasoning over the service descriptions.

A simple single axial ontology describing sequence alignment services

Sequence alignment
  Pairwise
    SmithWaterman
    BLAST
      BLASTn
      BLASTp
      tBLASTn
  Multiple

What do ontologies offer?

Common framework for integration: OpenMMS, TAMBIS, ONION

Search support, querying & matching: GO, MGED, UMLS, MeSH

Intelligent interfaces for queries and data capture: Ingenuity web based products, TAMBIS.

Control + Semantics

Copyright © 1998 Pangea Systems, Inc. All rights reserved.

What is Knowledge?

Knowledge – all information and an understanding to carry out tasks and to infer new information

Information -- data equipped with meaning

Data -- un-interpreted signals that reach our senses

Protein kinase C

Michael Ashburner | Professor | University of Cambridge | UK | IGF
Name | Job | Institution | Country | Conf
man | academic, senior | ancient university, 5 rated | European | important figure in biology

BIOLOGY


Page 34: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

EcoCyc

Page 35: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Gene Ontology http://www.geneontology.org

Page 36: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

Controlled vocabulary

• AGROVOC: Agricultural Vocabulary

Page 37: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

UMLS (Unified Medical Language System) http://umlsks.nlm.nih.gov/

• National Library of Medicine (NLM) database of medical terminology. Terms from several medical databases (MEDLINE, SNOMED International, Read Codes, etc.) are unified so that different terms are identified as the same medical concept.

• Metathesaurus provides the concordance of medical concepts: 730,000 concepts, 1.5 million concept names in different source vocabularies

• Specialist lexicon provides word synonyms, derivations, lexical variants, and grammatical forms of words used in Metathesaurus terms: 130,000 entries.

• Semantic Network codifies the relationships (e.g. causality, "is a", etc.) among medical terms: 134 semantic types, 54 relationships.

Page 38: An Introduction to Bio-Ontologies Robert Stevens Robert.Stevens@manchester.ac.uk

An Ontology Building Life-cycle

Identify purpose and scope

Knowledge acquisition

Evaluation

Language and representation

Available development tools

Conceptualisation

Integrating existing ontologies

Encoding

Building

Ontology Learning

Consistency Checking