including co-referent uris in a sparql query
Post on 29-Aug-2014
767 Views
Preview:
DESCRIPTION
TRANSCRIPT
Including Co-referent URIs in a SPARQL Query
Christian Y A Brenninkmeijer, Carole Goble, Alasdair J G Gray, Paul Groth,
Antonis Loizou, and Steve Pettifer
www.openphacts.org A.J.G.Gray@hw.ac.uk@open_phacts @gray_alasdair
Multiple Identities
Andy Law's Third Law“The number of unique identifiers assigned to an individual is never less than the number of Institutions involved in the study”
http://bioinformatics.roslin.ac.uk/lawslaws.html
22/10/2013 COLD 2013 2
P12047X31045P120
47
GB:29384RS_
2353
Are these the same thing?
Gleevec® = Imatinib Mesylate
22/10/2013 COLD 2013 3
DrugbankChemSpider PubChem
Imatinib
MesylateImatinib MesylateYLMAHDNUQAMNNX-UHFFFAOYSA-N
22/10/2013 COLD 2013 4
22/10/2013 COLD 2013 5
Multiple Links: Different Reasons
22/10/2013 COLD 2013 7
Link: skos:closeMatchReason: non-salt form
Link: skos:exactMatchReason: drug name
Strict Relaxed
Analysing Browsing
Dynamic Equality
22/10/2013 COLD 2013 8
skos:exactMatch(InChI)
Strict Relaxed
Analysing Browsing
Dynamic Equality
22/10/2013 COLD 2013 9
skos:closeMatch(Drug Name)
skos:closeMatch(Drug Name)
skos:exactMatch(InChI)
Open PHACTS Discovery Platform
22/10/2013 COLD 2013 10
Drug Discovery Platform
Apps
Domain API
Interactive responses
Production qualityintegration platform
MethodCalls
Integration Approach
• Data kept in original model• Data cached in central triple store• API call translated to SPARQL query• Query expressed in terms of original data
22/10/2013 COLD 2013 11
OPS Discovery Platform
RDFNanopub
Db
VoID
Data Cache (Virtuoso Triple Store)
Semantic Workflow Engine
Linked Data API (RDF/XML, TTL, JSON)DomainSpecificServices
Identity Resolution
Service
Chemistry RegistrationNormalisation & Q/C
IdentifierManagement
Service
Indexing
Cor
e Pl
atfo
rm
P12374EC2.43.4
CS4532
“Adenosine receptor 2a”
RDF
VoID
Db
RDFNanopub
Db
VoID
RDF
Db
VoID
RDFNanopub
VoID
Public Content Commercial
Public Ontologies
User Annotations
Apps
22/10/2013 COLD 2013 12
Platform Interaction
1. Resolve user input: – User enters search text– Resolve to a URI for concept
2. Request data for URI– Expand URI to equivalent for each dataset– Run resulting SPARQL query
22/10/2013 COLD 2013 13
?iri cheminf:logd ?logd .FILTER (?iri = cw:979b545d-f9a9 || ?iri = cs:2157 || ?iri = chembl:1280 || ?iri = db:db00945 )
cw:979b545d-f9a9 cheminf:logd ?logd .GRAPH <http://rdf.chemspider.com> {
}
cw:979b545d-f9a9 cheminf:logd ?logd .
Query Expansion
22/10/2013 COLD 2013 14
Identity Mapping Service
(BridgeDB)
Query Expander Service
Profiles
Mappings
Q, L1 Q’
[cw:979b545d-f9a9,cs:2157, chembl:1280,db:db00945]
cw:979b545d-f9a9, L1
Can also be achieved through UNION
Experiment
Is it feasible to use a stand-off mapping service?• Base lines (no external call):
– “Perfect” URIs– Linked data querying
• Expansion approaches (external service call):– FILTER by Graph– UNION by Graph
22/10/2013 COLD 2013 15
“Perfect” URI BaselineWHERE { GRAPH <chemspider> { cs:2157 cheminf:logp ?logp . } GRAPH <chembl> { chembl_mol:m1280 cheminf:mw ?mw . }}
22/10/2013 COLD 2013 16
Linked Data BaselineWHERE { GRAPH <chemspider> { cs:2157 cheminf:logp ?logp . } GRAPH <chembl> { ?chemblid cheminf:mw ?mw . } cs:2157 skos:exactMatch ?chemblid .}
22/10/2013 COLD 2013 17
Queries
Drawn from Open PHACTS API:1. Simple compound information (1)2. Compound information (1)3. Compound pharmacology (M)4. Simple target information (1)5. Target information (1)6. Target pharmacology (M)
22/10/2013 COLD 2013 18
Queries
Drawn from Open PHACTS API:1. Simple compound information (1)2. Compound information (1)3. Compound pharmacology (M)4. Simple target information (1)5. Target information (1)6. Target pharmacology (M)
22/10/2013 COLD 2013 19
Data:167,783,592 triples
Mappings:2,114,584 triples
Lenses:1
Datasets and Links
22/10/2013 COLD 2013 20
Average execution times
22/10/2013 COLD 2013 21
Average execution times
22/10/2013 COLD 2013 22
0.01
8
22/10/2013 COLD 2013 29
Conclusions
• Query expansion slower in general– Due to separate service call– Difference below human perception– UNION faster than FILTER on Virtuoso
• Stand-off mappings feasible• Infrastructure can support lenses
22/10/2013 COLD 2013 30
Strict Relaxed
Analysing Browsing
Questions
A.J.G.Gray@hw.ac.ukwww.macs.hw.ac.uk/~ajg33@gray_alasdair
Open PHACTS Project
pmu@openphacts.orgwww.openphacts.org@open_phacts
top related