Biological Expression Language Overview

August 2012

This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.


Contents

• BEL Statements
• BEL Statement Annotations
• BEL Terms
• BEL Functions
• BEL Relationships
• General Hints

BEL Statements

• Basic statement types:
  – Term Expression Relationship Term Expression
    • p(HGNC:CCND1) directlyIncreases kin(p(HGNC:CDK4))
  – Term Expression
    • complex(p(HGNC:CCND1), p(HGNC:CDK4))

BEL Statements

a(CHEBI:corticosteroid) -> path(MESHD:"Insulin Resistance")

• Form: Term Expression Relationship Term Expression
• Subject term a(CHEBI:corticosteroid): the abundance of molecules designated by the name “corticosteroid” in the CHEBI namespace.
• Relationship ->: increases
• Object term path(MESHD:"Insulin Resistance"): the pathology designated by the name “Insulin Resistance” in the MESHD namespace.

BEL Statements

• Complex statement type:
  – A causal statement can be used as the target term of a causal statement
  – Form: Term Expression Causal Relationship Causal Statement

p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))
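The statement forms above (a lone term, subject-relationship-object, and a nested causal statement as object) can be illustrated with a small Python sketch. This is not the BEL compiler's parser; the function names `top_level_tokens` and `parse_statement` are hypothetical, and the sketch assumes a single-line statement whose parts are separated by spaces.

```python
def top_level_tokens(text):
    """Split on spaces that are outside parentheses and outside quotes."""
    tokens, current, depth, in_quotes = [], "", 0, False
    for ch in text:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")":
            depth -= 1
        if ch == " " and depth == 0 and not in_quotes:
            if current:
                tokens.append(current)
            current = ""
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens


def parse_statement(text):
    """Return a dict with subject, relationship and object.

    A term-only statement has relationship and object set to None.
    A parenthesized object is parsed recursively as a nested statement.
    """
    tokens = top_level_tokens(text.strip())
    if len(tokens) == 1:
        return {"subject": tokens[0], "relationship": None, "object": None}
    if len(tokens) != 3:
        raise ValueError("expected 'term' or 'term relationship term': " + text)
    subject, relationship, obj = tokens
    if obj.startswith("(") and obj.endswith(")"):
        obj = parse_statement(obj[1:-1])
    return {"subject": subject, "relationship": relationship, "object": obj}


nested = parse_statement(
    "p(HGNC:CLSPN) -> (kin(p(HGNC:ATR)) => p(HGNC:CHEK1, pmod(P)))"
)
print(nested["relationship"], nested["object"]["relationship"])
```

Tracking parenthesis depth and quote state is what keeps the space inside "Insulin Resistance" and the spaces inside nested arguments from being treated as separators.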


BEL Statement Annotations

• Annotations provide information about one or more BEL Statements

SET Citation = {"PubMed", "J Mol Med", "12682725", "2003-03-14","Limbourg FP|Liao JK",""}

SET Evidence = "high-dose steroid treatment decreases vascular inflammation and ischemic tissue damage after myocardial infarction and stroke through direct vascular effects involving the nontranscriptional activation of eNOS"

SET Species = "9606"

SET Tissue = "Vascular System"

SET Disease = "Stroke"

a(CHEBI:corticosteroid) -| bp(MESHD:"Inflammation")
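As a rough illustration of how SET lines relate to the statements that follow them, the Python sketch below collects annotations and attaches the current set to each statement. It assumes that an annotation stays in effect until it is set again; the helper name `attach_annotations` is hypothetical, and list-valued annotations such as Citation are kept as raw text rather than parsed.

```python
import re

# Matches lines of the form: SET Name = value
SET_LINE = re.compile(r"^SET\s+(\w+)\s*=\s*(.+)$")


def attach_annotations(lines):
    """Pair each statement with a copy of the annotations set before it."""
    current = {}
    result = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = SET_LINE.match(line)
        if match:
            name, value = match.group(1), match.group(2).strip()
            # Drop surrounding quotes from simple string values;
            # brace-delimited lists are left untouched.
            current[name] = value.strip('"')
        else:
            result.append((line, dict(current)))
    return result


script = [
    'SET Species = "9606"',
    'SET Tissue = "Vascular System"',
    'a(CHEBI:corticosteroid) -| bp(MESHD:"Inflammation")',
]
for statement, annotations in attach_annotations(script):
    print(statement, annotations)
```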


BEL Terms

• BEL terms minimally have the following components:
  – Function
    • Required
    • Can be nested to create complex terms
  – Namespace Abbreviation
    • Optional
  – Value
    • Required
    • Generally found in the referenced namespace
• BEL terms using values from different namespaces can be equivalenced

function(ns:value)

BEL Terms

a(CHEBI:corticosteroid)
• Function: abundance()
• Namespace abbreviation: CHEBI
• Namespace value: corticosteroid

path(MESHD:"Insulin Resistance")
• Function: pathology()
• Namespace abbreviation: MESHD
• Namespace value: "Insulin Resistance"
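The function(ns:value) decomposition can be sketched with a regular expression in Python. This is a minimal illustration that handles only simple, non-nested terms; the name `parse_simple_term` is hypothetical, and the pattern is an assumption about what a simple term looks like rather than the official BEL grammar.

```python
import re

# function(namespace:value), where the namespace prefix is optional and
# the value is either quoted or a run of characters without separators.
TERM = re.compile(
    r'^(?P<function>\w+)\('
    r'(?:(?P<namespace>\w+):)?'
    r'(?P<value>"[^"]*"|[^\s"(),:]+)'
    r'\)$'
)


def parse_simple_term(term):
    """Return (function, namespace, value); namespace may be None."""
    match = TERM.match(term.strip())
    if match is None:
        raise ValueError("not a simple function(ns:value) term: " + term)
    value = match.group("value").strip('"')
    return match.group("function"), match.group("namespace"), value


print(parse_simple_term("a(CHEBI:corticosteroid)"))
print(parse_simple_term('path(MESHD:"Insulin Resistance")'))
```

Nested terms such as kin(p(HGNC:CDK4)) are rejected on purpose; handling them needs a recursive parser rather than a single pattern.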

Equivalence of Terms

• p(EG:207): “the abundance of the protein designated by EntrezGene id 207” (human AKT1)
• p(SPAC:P31749): “the abundance of the protein designated by Swiss-Prot id P31749” (human AKT1)
• p(HGNC:AKT1): “the abundance of the protein designated by HGNC gene symbol ‘AKT1’” (human AKT1)
• Terms are unified during compilation using information in the BEL namespace equivalence documents
  – Can unify to p(HGNC:AKT1) in the KAM
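The idea of unifying equivalent terms can be sketched in Python as a lookup over equivalence classes. The table below is a hand-written stand-in for a namespace equivalence document (its real format is not shown in this overview), and `unify` and `PREFERRED_NAMESPACE` are hypothetical names chosen for the example.

```python
# Hypothetical stand-in for the content of a namespace equivalence
# document: each set lists identifiers that designate the same entity.
EQUIVALENCE_CLASSES = [
    {"EG:207", "SPAC:P31749", "HGNC:AKT1"},
]

# Assumption for this sketch: unify to the HGNC identifier when available.
PREFERRED_NAMESPACE = "HGNC"


def unify(function, identifier):
    """Rewrite function(identifier) to its preferred equivalent, if any."""
    for members in EQUIVALENCE_CLASSES:
        if identifier in members:
            for member in sorted(members):
                if member.startswith(PREFERRED_NAMESPACE + ":"):
                    return f"{function}({member})"
    # No known equivalence: leave the term unchanged.
    return f"{function}({identifier})"


for ident in ("EG:207", "SPAC:P31749", "HGNC:AKT1"):
    print(unify("p", ident))
```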


BEL Functions

• Types of functions:
  – Abundances
  – Processes
  – Modifications of abundances
  – Activities
  – Transformations
  – List functions
• Abundances and processes are applied directly to namespace values
• All other functions are applied to abundance functions!

BEL Functions - Abundances

• Abundances
  – abundance(), a()
  – geneAbundance(), g()
  – rnaAbundance(), r()
  – microRNAAbundance(), m()
  – proteinAbundance(), p()
  – complexAbundance(), complex()
  – compositeAbundance(), composite()

abundance(), a()

• Use abundance() to represent any abundances that are not represented by a more specific abundance type, including:
  – Chemicals
    • a(CHEBI:corticosteroid)
  – Cellular structures
    • a(GOCCTERM:"astral microtubule")
• No modification functions apply to abundance terms
• Generally, activity functions do not apply to abundance terms

geneAbundance(), g()

• Use geneAbundance terms to represent DNA
  – Can use to represent gene amplification and deletion events
  – Used in "gene scaffolding"
    • g(HGNC:AKT1) transcribedTo r(HGNC:AKT1)
  – Use in complexes to represent binding to promoters
    • complex(p(HGNC:TP53), g(HGNC:CDKN1A))
• In BEL v1.0, the only modification function that can be applied to gene abundances is fusion()
  – g(HGNC:TMPRSS2,fusion(HGNC:ERG))
• No activity functions apply to geneAbundance terms

complexAbundance(), complex()

• Use complexAbundance() to represent molecular complexes and binding events
• complexAbundance terms can take two forms:
  – complexAbundance(ns:value)
    • Used for named complexes
    • E.g., complexAbundance(NCH:"AP-1 Complex")
  – complexAbundance(<abundance term list>)
    • Use to represent binding events or to define complexes by components
    • Unordered list
    • E.g., complex(p(HGNC:FOS),p(HGNC:JUN))
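Because the component list of a complex is unordered, two complex terms that list the same members in a different order denote the same complex. One way to compare them, sketched here in Python, is to sort the members into a canonical order. The helper names are hypothetical, only the short form complex(...) is handled, and this is not how the BEL compiler is documented to work.

```python
def split_arguments(inner):
    """Split a comma-separated argument list at the top nesting level."""
    args, current, depth, in_quotes = [], "", 0, False
    for ch in inner:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")":
            depth -= 1
        if ch == "," and depth == 0 and not in_quotes:
            args.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        args.append(current.strip())
    return args


def canonical_complex(term):
    """Return the complex term with its members in sorted order."""
    prefix = "complex("
    term = term.strip()
    if not (term.startswith(prefix) and term.endswith(")")):
        raise ValueError("not a complex(...) term: " + term)
    members = split_arguments(term[len(prefix):-1])
    return prefix + ", ".join(sorted(members)) + ")"


a = canonical_complex("complex(p(HGNC:FOS),p(HGNC:JUN))")
b = canonical_complex("complex(p(HGNC:JUN), p(HGNC:FOS))")
print(a == b)
```

A named complex such as complex(NCH:"AP-1 Complex") has a single argument and passes through unchanged.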

compositeAbundance(), composite()

• Use to represent cases where multiple abundances synergize to produce an effect
  – Composite terms should not be used if any of the abundances alone is reported to cause the effect
  – Use composite terms only as subjects of statements
  – E.g., composite(p(HGNC:TGFB1), p(HGNC:IL6))

BEL Functions - Processes

• Processes include biological phenomena that occur at the level of the cell or organism
  – biologicalProcess(), bp()
    • E.g., bp(GO:"cellular senescence")
  – pathology(), path()
    • E.g., path(MESHD:"Muscle Hypotonia")

BEL Functions – Abundance Modifications

• Modifications are functions used as arguments within abundance functions
• Currently supported modification types are:
  – Variants - use to represent protein sequence variants, generally resulting from a mutation or polymorphism
    • substitution(), truncation(), fusion()
    • E.g., p(HGNC:PIK3CA, sub(E, 545, K))
      – PIK3CA protein with glutamic acid 545 substituted with a lysine
  – Protein Modifications - use to represent post-translational modifications of proteins
    • Includes phosphorylation, ubiquitination, acetylation, glycosylation
    • proteinModification()
    • E.g., p(HGNC:HIF1A, pmod(H, N, 803))
      – Modification of HIF1A by hydroxylation at amino acid asparagine 803
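To make the reading of a substitution argument concrete, the Python sketch below turns sub(E, 545, K) into the kind of description given above. The function name `describe_substitution` is hypothetical; the table holds the standard one-letter amino acid codes, and the pattern assumes the three-argument form shown in the example.

```python
import re

# Standard one-letter amino acid codes.
AMINO_ACIDS = {
    "A": "alanine", "R": "arginine", "N": "asparagine",
    "D": "aspartic acid", "C": "cysteine", "E": "glutamic acid",
    "Q": "glutamine", "G": "glycine", "H": "histidine",
    "I": "isoleucine", "L": "leucine", "K": "lysine",
    "M": "methionine", "F": "phenylalanine", "P": "proline",
    "S": "serine", "T": "threonine", "W": "tryptophan",
    "Y": "tyrosine", "V": "valine",
}

# sub(<reference residue>, <position>, <variant residue>)
SUB = re.compile(r"sub\(\s*([A-Z])\s*,\s*(\d+)\s*,\s*([A-Z])\s*\)")


def describe_substitution(term):
    """Describe the first sub(...) argument in a term, or return None."""
    match = SUB.search(term)
    if match is None:
        return None
    reference, position, variant = match.groups()
    return (
        f"{AMINO_ACIDS[reference]} {position} "
        f"substituted with {AMINO_ACIDS[variant]}"
    )


print(describe_substitution("p(HGNC:PIK3CA, sub(E, 545, K))"))
```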

BEL Functions - Activities

• Activity functions are applied to protein, complex, and RNA abundances to specify the frequency of events resulting from the molecular activity of the abundance
  – E.g., tport(complex(NCH:"EnaC Complex"))
    • Transporter activity of the EnaC sodium channel complex
• Distinguishing the activity of an abundance from the abundance itself is particularly useful for proteins whose activities are regulated by post-translational modification
• BEL v1.0 supports 10 distinct activity functions:
  – catalyticActivity, peptidaseActivity, gtpBoundActivity, transportActivity, chaperoneActivity, transcriptionalActivity, molecularActivity, kinaseActivity, phosphataseActivity, ribosylationActivity
• molecularActivity() should be used to represent activities that are not represented by a more specific function

BEL Functions - Transformations

• Transformations are events in which one class of abundance is transformed or changed into a second class of abundance
  – Translocations
    • translocation(), tloc()
    • cellSecretion(), sec()
    • cellSurfaceExpression(), surf()
  – Reactions
    • reaction(), rxn()
  – Degradation
    • degradation(), deg()

translocation(), tloc()

• Use translocation terms to represent the movement of abundances from one cellular location to another
• E.g., tport(complex(NCH:"EnaC Complex")) => \
  tloc(a(CHEBI:"sodium(1+)"), MESHCL:"Extracellular Space", \
  MESHCL:"Intracellular Space")
  – The transport activity of the EnaC Complex translocates sodium ions from the extracellular space to the intracellular space

cellSecretion(), sec()
cellSurfaceExpression(), surf()

• sec() and surf() are convenience functions for commonly used translocations

degradation(), deg()

• Generally used to indicate complete proteolysis of a protein
• Do not use to indicate proteolysis which results in functional cleavage products!
• During compilation Phase I, degradation nodes are linked to the root abundance with a directlyDecreases relationship
  – E.g., deg(p(HGNC:MAPT))
  – Compilation adds: deg(p(HGNC:MAPT)) =| p(HGNC:MAPT)
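The statement added during compilation can be mimicked with a few lines of Python. This only illustrates the rule described above and is not the compiler's code; `degradation_edge` is a hypothetical name, and the sketch assumes the input is a single, well-formed degradation term.

```python
def degradation_edge(term):
    """Return the implied 'deg(X) =| X' statement, or None if not a deg term."""
    term = term.strip()
    for prefix in ("degradation(", "deg("):
        if term.startswith(prefix) and term.endswith(")"):
            # The root abundance is the single argument of the function.
            root = term[len(prefix):-1]
            return f"{term} =| {root}"
    return None


print(degradation_edge("deg(p(HGNC:MAPT))"))
```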

BEL Functions – List Functions

• List functions are used for:
  – Protein family assignment
    • p(PFH:"Cu-Zn SOD Family") hasMembers list(p(HGNC:SOD1), p(HGNC:SOD3))
  – Complex component assignment
    • complex(GOCCTERM:"gamma-secretase complex") hasComponents \
      list(p(HGNC:PSEN1),p(HGNC:NCSTN),p(HGNC:APH1A),p(HGNC:PSEN2))
  – Reactants and Products within a reaction term
    • rxn(reactants(a(CHEBI:superoxide)), \
      products(a(CHEBI:"hydrogen peroxide")))
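A list-based assignment can be read as shorthand for one statement per list element. The Python sketch below performs that expansion under the assumption, not stated in this overview, that `X hasMembers list(a, b)` is equivalent to `X hasMember a` plus `X hasMember b`, and likewise for hasComponents. The helper names are hypothetical and the input is expected on a single line, without backslash continuations.

```python
def split_arguments(inner):
    """Split a comma-separated argument list at the top nesting level."""
    args, current, depth, in_quotes = [], "", 0, False
    for ch in inner:
        if ch == '"':
            in_quotes = not in_quotes
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")":
            depth -= 1
        if ch == "," and depth == 0 and not in_quotes:
            args.append(current.strip())
            current = ""
        else:
            current += ch
    if current.strip():
        args.append(current.strip())
    return args


def expand_list_statement(statement):
    """Expand hasMembers/hasComponents into singular statements."""
    statement = statement.strip()
    pairs = (("hasMembers", "hasMember"), ("hasComponents", "hasComponent"))
    for plural, singular in pairs:
        marker = f" {plural} list("
        if marker in statement and statement.endswith(")"):
            subject, rest = statement.split(marker, 1)
            items = split_arguments(rest[:-1])  # drop the closing ")"
            return [f"{subject} {singular} {item}" for item in items]
    # Not a list assignment: return the statement unchanged.
    return [statement]


for line in expand_list_statement(
    'p(PFH:"Cu-Zn SOD Family") hasMembers list(p(HGNC:SOD1), p(HGNC:SOD3))'
):
    print(line)
```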


BEL Relationships

• Causal relationships
  – increases, directlyIncreases, decreases, directlyDecreases, rateLimitingStepOf, causesNoChange
• Correlative relationships
  – negativeCorrelation, positiveCorrelation, association
• Biomarker relationships
  – biomarkerFor, prognosticBiomarkerFor
• Assignment to groups
  – hasMember, hasComponent, hasMembers, hasComponents
• Other
  – isA, subProcessOf
• Genomic relationships
  – transcribedTo, translatedTo, orthologousTo
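The grouping above can be written down as a lookup table, for example to check which class a relationship in a statement belongs to. In the Python sketch below the class labels and the function name `classify_relationship` are made up for illustration; the four short forms are the ones used in the examples of this overview, with "=>" and "-|" paired to directlyIncreases and decreases following BEL v1.0 convention.

```python
# Short forms used in the examples of this overview.
SHORT_FORMS = {
    "->": "increases",
    "=>": "directlyIncreases",
    "-|": "decreases",
    "=|": "directlyDecreases",
}

# Relationship classes as listed above; the labels are illustrative.
RELATIONSHIP_CLASSES = {
    "causal": [
        "increases", "directlyIncreases", "decreases",
        "directlyDecreases", "rateLimitingStepOf", "causesNoChange",
    ],
    "correlative": ["negativeCorrelation", "positiveCorrelation", "association"],
    "biomarker": ["biomarkerFor", "prognosticBiomarkerFor"],
    "group assignment": ["hasMember", "hasComponent", "hasMembers", "hasComponents"],
    "other": ["isA", "subProcessOf"],
    "genomic": ["transcribedTo", "translatedTo", "orthologousTo"],
}


def classify_relationship(token):
    """Return the class of a relationship name or short form, else None.

    Matching is case sensitive, as relationship names are in BEL.
    """
    name = SHORT_FORMS.get(token, token)
    for label, names in RELATIONSHIP_CLASSES.items():
        if name in names:
            return label
    return None


print(classify_relationship("=|"), classify_relationship("transcribedTo"))
```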

BEL Relationships – Compiler Inserted Relationships

• These relationships are not needed for creating BEL statements
  – Used only by the compiler
    • actsIn
    • hasModification
    • hasProduct
    • hasVariant
    • reactantIn
    • translocates
    • includes


General BEL Hints

• BEL functions, relationships, and namespace values are all case sensitive
• Every term must have a function
  – Namespace values are always associated with an abundance or process function
  – Exception - cellular location values within a translocation function
• Namespace values with spaces or unusual characters require quotes
  – E.g., complex(GOCCTERM:"gamma-secretase complex")
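The quoting hint can be applied mechanically when terms are generated by a program. The Python sketch below quotes a namespace value whenever it contains anything other than letters, digits, or underscores; that reading of "unusual characters" is an assumption made for this example, and `format_parameter` is a hypothetical helper.

```python
import re

# Assumption: values made only of letters, digits and underscores are
# safe to write without quotes; everything else gets quoted.
PLAIN_VALUE = re.compile(r"^[A-Za-z0-9_]+$")


def format_parameter(namespace, value):
    """Return ns:value, quoting the value when it is not plain."""
    if not PLAIN_VALUE.match(value):
        value = '"' + value + '"'
    return f"{namespace}:{value}"


print("complex(" + format_parameter("GOCCTERM", "gamma-secretase complex") + ")")
print("p(" + format_parameter("HGNC", "AKT1") + ")")
```

Quoting conservatively is a deliberate choice here: a quoted value that did not strictly need quotes is still valid in the examples of this overview, such as CHEBI:"sodium(1+)".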