Post on 04-Jan-2016


Function-process links

The theory

Why bother?

• To improve the ontology
• To fill in annotation gaps
• As an aid to annotation
  – Suggest new annotations
  – Avoid redundant annotation effort
  – Annotation cross-products
• Better integration with pathway databases
• To present annotations to users in more useful ways
  – e.g. more informative AmiGO displays

GO in 2008

Filling in annotation gaps

[Figure: overlap between gene products annotated to GO:0016301 kinase activity (F) and GO:0016310 phosphorylation (P), July 2008]

|P| = 3640
|F| = 6053
|F ∩ P| = 2230
|F ∩ not P| = 3823
|P ∩ not F| = 1410
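The counts on this slide are plain set arithmetic over the gene products annotated to each term. A minimal sketch, using invented gene product identifiers (the two sets below are illustrative only, not real annotation data):

```python
# Toy annotation sets; the gene product names are invented for illustration.
kinase_activity = {"gp1", "gp2", "gp3", "gp4", "gp5"}  # F: annotated to GO:0016301
phosphorylation = {"gp1", "gp2", "gp6"}                # P: annotated to GO:0016310

both = kinase_activity & phosphorylation          # F ∩ P
gap = kinase_activity - phosphorylation           # F ∩ not P: the annotation gap
process_only = phosphorylation - kinase_activity  # P ∩ not F

# Consistency check on the slide's own numbers.
assert 2230 + 3823 == 6053  # |F ∩ P| + |F ∩ not P| = |F|
assert 3640 - 2230 == 1410  # |P| - |F ∩ P| = |P ∩ not F|

print(sorted(gap))
```

The `gap` set is the interesting one: gene products with the function annotation but no corresponding process annotation.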

Filling in annotation gaps

[Figure: GO:0016301 kinase activity and GO:0016310 phosphorylation. Future - 2009]

Improved presentation to users

[Figure: annotations propagate over part_of. Annotations shown: KIC1 (IDA), NDK1 (IDA)]

A quick review of part_of

• Means “always part of some”
  – Example:
    • nucleus part_of cell
    • EVERY nucleus is part_of SOME cell
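Because part_of holds for every instance, an annotation to a term can be carried up to whatever that term is part_of. A minimal sketch of that propagation, assuming a toy graph (the `EDGES` table, the "metabolic process" parent, and the KIC1 annotation to kinase activity are illustrative assumptions, not taken from GO files):

```python
# Hypothetical mini-ontology: term -> list of (relation, parent term).
EDGES = {
    "kinase activity": [("part_of", "phosphorylation")],
    "phosphorylation": [("is_a", "metabolic process")],
}


def ancestors(term, edges, relations=("is_a", "part_of")):
    """Return every term reachable from `term` over the given relations."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for relation, parent in edges.get(current, []):
            if relation in relations and parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def propagate(annotations, edges):
    """Map each gene product to its direct terms plus all inferred ancestors."""
    inferred = {}
    for gene_product, terms in annotations.items():
        closure = set(terms)
        for term in terms:
            closure |= ancestors(term, edges)
        inferred[gene_product] = closure
    return inferred


# Assumed direct annotation, for illustration only.
annotations = {"KIC1": {"kinase activity"}}
result = propagate(annotations, EDGES)
print(sorted(result["KIC1"]))
```

Passing `relations=("is_a",)` to `ancestors` shows what is lost without the function-process part_of link: the walk from kinase activity never reaches phosphorylation.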

Mining pathway DBs for links

[Figure, built up over three slides.
Reactome side: "glycolysis" with the reactions "fructose bisphosphatase activity of fructose 1 6 bisphosphatase 2 _cytosol" and "glucose 6 phosphate isomerase activity of glucose 6 phosphate isomerase dimer_cytosol".
GO side: "glycolysis" (BP) with the MF terms "fructose-bisphosphate aldolase activity" and "glucose-6-phosphate isomerase activity".
Slide 1: the two sides shown next to each other.
Slide 2: links labelled xref added.
Slide 3: links labelled has_part added.]
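The mining step pictured above can be sketched as a join: follow the pathway database's own pathway-to-reaction containment, then translate both ends through the xrefs. This is a minimal sketch under stated assumptions; the identifiers (`R:glycolysis`, `R:rxn_gpi`, `R:rxn_other`) and both tables are hypothetical stand-ins, not Reactome data or a Reactome API:

```python
# Hypothetical pathway DB content: pathway id -> reaction ids it contains.
pathway_reactions = {"R:glycolysis": ["R:rxn_gpi", "R:rxn_other"]}

# Hypothetical xrefs from pathway DB entries to GO term names.
xref_to_go = {
    "R:glycolysis": "glycolysis",
    "R:rxn_gpi": "glucose-6-phosphate isomerase activity",
    # "R:rxn_other" has no xref, so no link is produced for it.
}


def mine_has_part(pathway_reactions, xref_to_go):
    """Return (process, 'has_part', function) triples implied by the xrefs."""
    links = set()
    for pathway, reactions in pathway_reactions.items():
        process = xref_to_go.get(pathway)
        if process is None:
            continue  # pathway has no GO counterpart
        for reaction in reactions:
            function = xref_to_go.get(reaction)
            if function is not None:
                links.add((process, "has_part", function))
    return links


links = mine_has_part(pathway_reactions, xref_to_go)
print(links)
```

Only reactions with an xref yield a link, which is why the coverage of the mapping file limits what can be mined.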

xrefs: not necessarily equivalent

[Figure, built up over three slides.
Reactome side: "glycolysis [human]" with the reactions "fructose bisphosphatase activity of fructose 1 6 bisphosphatase 2 _cytosol" and "glucose 6 phosphate isomerase activity of glucose 6 phosphate isomerase dimer_cytosol".
GO side: "glycolysis" with "fructose-bisphosphate aldolase activity" and "glucose-6-phosphate isomerase activity".
Slide 1 labels: equivalent, GO:new, is_a, has_part?
Slide 2 labels: equivalent, GO:new, is_a, some_has_part, has_part
Slide 3 labels: xref, some_has_part, has_part]

Specifics

• Low-hanging fruit
  – Function to process links
    • Mostly part_of links
    • Some regulates links
• Pathways
  – Process to function
    • has_part
  – Mining from pathways databases & curation

Function-process links

Conclusions of the electron transport working group.

Function: UTP:glucose-1-phosphate uridylyltransferase activity
α-D-glucose 1-phosphate + UTP -> UDP-D-glucose + diphosphate

Process (each linked to the function by hp):
• glucose metabolic process
• UDP-glucose metabolic process
• galactose metabolic process
• biosynthetic process
• colanic acid biosynthetic process
• response to desiccation
• carbohydrate catabolic process

Function: argininosuccinate synthase activity
Catalysis of the reaction: ATP + L-citrulline + L-aspartate = AMP + diphosphate + (N(omega)-L-arginino)succinate

Process (each linked to the function by hp):
• urea cycle
• arginine biosynthetic process
• polyamine biosynthesis

Function: carbamoyl-phosphate synthase activity
Catalysis of a reaction that results in the formation of carbamoyl phosphate.

Process (each linked to the function by hp):
• Urea cycle and metabolism of amino groups
• Glutamate metabolism
• Arginine and proline metabolism
• Nitrogen metabolism

Lysine biosynthesis pathways

[Figure: Process row with "lysine biosynthesis" and is_a links to "lysine biosynthesis 1" through "lysine biosynthesis 6", plus a possible "lysine biosynthesis 7?"; Function row below]

Lysine Biosynthesis

[Figure, built up over four slides: Process row (Lysine Biosynthesis, then Process B, then Process C) above a Function row.
Legend: shared function? = has_part; non-shared function; new GO term; existing GO term]

Relationship explosion (or Editorial office explosion)

Where do pathways start and end?

[Figure: steps A, B, C, D, with process 1, process 2 and process 3 marked against them]

Use cases

• Can we slim from function up to process?

• Can we infer annotations to process from those to function?

[Figure, built up over three slides.
Process: urea cycle, polyamine biosynthesis
Function: argininosuccinate synthase activity
Gene products: Gene product x, Gene product y
Link labels: has_part, has_function
On the first slide the has_function links carry the qualifiers "has_function but only as part_of polyamine biosynthesis" and "has_function but only as part_of urea cycle". The second slide adds a "?".]

No. has_part cannot be used for slimming.

Can we infer annotations to process from those to function?

• No. There is too much variation in process details, and too many functions are shared.
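The problem with shared functions can be shown in a few lines. This is a sketch only: the has_part links follow the slides, but the "true" process for gene product x is an assumption made up for the example.

```python
# Process -> functions it has_part, as drawn on the slides.
has_part = {
    "urea cycle": {"argininosuccinate synthase activity"},
    "polyamine biosynthesis": {"argininosuccinate synthase activity"},
}

# Function annotation for one gene product.
function_annotations = {"gene product x": {"argininosuccinate synthase activity"}}

# Hypothetical ground truth: assume x acts only in the urea cycle.
true_processes = {"gene product x": {"urea cycle"}}


def naive_infer(function_annotations, has_part):
    """Annotate each gene product to every process that has_part one of its functions."""
    inferred = {}
    for gene_product, functions in function_annotations.items():
        inferred[gene_product] = {
            process for process, parts in has_part.items() if parts & functions
        }
    return inferred


inferred = naive_infer(function_annotations, has_part)
false_positives = inferred["gene product x"] - true_processes["gene product x"]
print(false_positives)
```

The naive rule assigns both processes, so one of the two inferred annotations is wrong under the stated assumption. has_part says the process needs the function, not that every gene product with the function takes part in the process.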

So what can we do?

[Figure: kinase activity (Function) part_of phosphorylation (Process)]

We can make relationships between single-step processes and their respective functions.

[Figure: glucose transporter activity (Function) part_of glucose transport (Process)]

We can make any obvious relationship where part_of holds, and this will allow useful slimming.

We can mine the other links from pathway databases and make non-curated sometimes_part_of links.
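One way to picture this step: take the process-to-function pairs mined from a pathway database, flip them, and emit a sometimes_part_of link for every pair that is not already covered by a curated part_of link. A minimal sketch; the input pairs below are hypothetical examples, not mined data:

```python
# Curated (function, process) pairs where part_of always holds.
curated_part_of = {("kinase activity", "phosphorylation")}

# Hypothetical mined (process, function) pairs from a pathway database.
mined_has_part = {
    ("urea cycle", "argininosuccinate synthase activity"),
    ("polyamine biosynthesis", "argininosuccinate synthase activity"),
    ("phosphorylation", "kinase activity"),
}


def sometimes_part_of(mined_has_part, curated_part_of):
    """Flip mined pairs into function -> process links, skipping curated ones."""
    links = set()
    for process, function in mined_has_part:
        if (function, process) not in curated_part_of:
            links.add((function, "sometimes_part_of", process))
    return links


links = sometimes_part_of(mined_has_part, curated_part_of)
for link in sorted(links):
    print(link)
```

The curated pair is left out so that the weaker, non-curated link never shadows a relationship that has already been checked by hand.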

What does this buy us?

• Very full coverage of function-process links.
• No manual link curation.

What work does it involve?

• We maintain the mapping files, e.g. reactome2go.
• We write the mining scripts.
• Work with pathway dbs to unify exchange formats and make data interoperable.
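A mining script would start by reading a mapping file. The sketch below assumes the line layout commonly used by GO external2go mapping files (`<db>:<id> <label> > GO:<term name> ; GO:<id>`, with `!` marking comment lines); that layout should be checked against the actual reactome2go file, and the `Reactome:EXAMPLE_*` identifiers and labels are made up.

```python
import re

# Assumed line layout: "<db>:<id> <label> > GO:<term name> ; GO:<7 digits>"
LINE = re.compile(
    r"^(?P<db>[^:\s]+):(?P<id>\S+)\s*(?P<label>.*?)\s*>\s*"
    r"GO:(?P<name>.+?)\s*;\s*(?P<go_id>GO:\d{7})\s*$"
)


def parse_mapping(lines):
    """Return (external id, GO id, GO term name) for each well-formed line."""
    mappings = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("!"):
            continue  # skip blanks and comment lines
        match = LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        external_id = f"{match['db']}:{match['id']}"
        mappings.append((external_id, match["go_id"], match["name"]))
    return mappings


# Made-up sample lines; only the GO ids and names come from the slides.
sample = [
    "! comment line",
    "Reactome:EXAMPLE_1 made-up phosphorylation event > GO:phosphorylation ; GO:0016310",
    "Reactome:EXAMPLE_2 made-up kinase reaction > GO:kinase activity ; GO:0016301",
]
mappings = parse_mapping(sample)
print(mappings)
```

Lines that do not match are skipped silently here; a production script would more likely log them so that format drift in the mapping file is noticed.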

Acknowledgements

Michelle Gwinn-Giglio
Debbie Siegele
Ingrid Keseler
Harold Drabkin
Jennifer Deegan
Chris Mungall
Peifen Zhang