bio2rdf@bh2010
Agenda
The problem
What is RDF?
The vision
What is known about hexokinase?
A new approach: the Cognoscope
http://www.pcworld.idg.com.au/article/132245/berners-lee_seeks_killer_app_semantic_web
"Similarly, if we could get critical mass in life sciences, if we get a half a dozen or a dozen set of ontologies, the core ones for drug discovery out there, then suddenly the Semantic Web within life sciences would have a critical mass. It'll snowball much more rapidly and it will be copied. Other areas will realize: Oh it's worth investing in this,"
Tim Berners-Lee, WWW inventor
http://www.biopax.org/Docs/2004-10-28_SWLS-SessionVII.pdf
http://informationarchitects.jp/ia-trendmap-2007v2/
Web Trend Map 2007
The proposed solution
Bio2RDF solves the problem of data integration in bioinformatics by applying the Semantic Web approach based on RDF, OWL and SPARQL technologies.
"Wouldn't it be great if you were able to organize all this information based on your own terms, instead of based on the application you use to access the information?"
Ramanathan V. Guha, RDF initiator
http://cgi.netscape.com/columns/techvision/innovators_rg.html
The same in RDF/XML
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:exterms="http://www.example.org/terms/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <exterms:creation-date>August 16, 1999</exterms:creation-date>
  </rdf:Description>
</rdf:RDF>
The same in N-Triples
<http://www.example.org/index.html> <http://www.example.org/terms/creation-date> "August 16, 1999" .
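To make the equivalence of the two syntaxes concrete, here is a minimal Python sketch that reads the RDF/XML example and prints the matching N-Triples line. It is not a general RDF/XML parser: it assumes only the simple shape used on the slide (an rdf:Description with an rdf:about attribute and literal-valued properties), and the function name simple_rdfxml_to_ntriples is made up for this illustration.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The RDF/XML document from the slide.
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:exterms="http://www.example.org/terms/">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <exterms:creation-date>August 16, 1999</exterms:creation-date>
  </rdf:Description>
</rdf:RDF>"""


def simple_rdfxml_to_ntriples(xml_text):
    """Convert the simple RDF/XML shape above into N-Triples lines.

    Assumption: every property is a plain literal; nested resources,
    datatypes and language tags are not handled.
    """
    root = ET.fromstring(xml_text)
    lines = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree spells qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}", 1)
            literal = (prop.text or "").replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'<{subject}> <{namespace}{local}> "{literal}" .')
    return lines


ntriples = simple_rdfxml_to_ntriples(RDF_XML)
for line in ntriples:
    print(line)
```

The predicate URI is simply the XML namespace concatenated with the element name, which is why exterms:creation-date becomes http://www.example.org/terms/creation-date.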
It is a technology stack
http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/
It is a distributed architecture
http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/
Linked Data cloud evolution
http://linkeddata.org/
http://esw.w3.org/topic/TaskForces/CommunityProjects/LinkingOpenData/DataSets/Statistics
Linked data cloud in March 2009
Linked data cloud in May 2007
http://www.ted.com/talks/tim_berners_lee_on_the_next_web.html
LODD wins the 2009 Triplify challenge
http://triplify.org/files/challenge_2009/LODD.pdf
http://bio2rdf.wiki.sourceforge.net/Banff%20Manifesto
Actual Architecture 2010
Offline RDFizing process
Virtuoso SPARQL endpoints network
Namespace resolution through DNS subdomain
Bio2RDF has 3 mirror sites
http://cu.bio2rdf.org/
http://qut.bio2rdf.org/
http://quebec.bio2rdf.org/
Main REST services
Describe a resource by a dereferenceable URI
http://bio2rdf.org/ns:id
Global services over federated endpoints
http://bio2rdf.org/links/ns:id
http://bio2rdf.org/search/searchedTerm
Targeted services to a specific endpoint
http://bio2rdf.org/linksns/ns2/ns:id
http://bio2rdf.org/searchns/ns/searchedTerm
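The URL patterns above can be assembled mechanically. The following Python sketch only builds the URLs and sends no requests, since the services may not be online any more. The function names are hypothetical, the reading of the first path segment in the targeted services as the namespace of the endpoint to query is an assumption based on the slide wording, and geneid:15275 is only an illustrative identifier.

```python
from urllib.parse import quote

BASE = "http://bio2rdf.org"  # base URL taken from the slides


def describe_url(ns, identifier):
    """Dereferenceable URI describing one resource: /ns:id"""
    return f"{BASE}/{ns}:{identifier}"


def links_url(ns, identifier):
    """Global link lookup over the federated endpoints: /links/ns:id"""
    return f"{BASE}/links/{ns}:{identifier}"


def search_url(term):
    """Global full-text search: /search/searchedTerm"""
    return f"{BASE}/search/{quote(term)}"


def links_in_endpoint_url(endpoint_ns, ns, identifier):
    """Link lookup restricted to one endpoint: /linksns/ns2/ns:id"""
    return f"{BASE}/linksns/{endpoint_ns}/{ns}:{identifier}"


def search_in_endpoint_url(endpoint_ns, term):
    """Full-text search restricted to one endpoint: /searchns/ns/searchedTerm"""
    return f"{BASE}/searchns/{endpoint_ns}/{quote(term)}"


# Illustrative values only.
print(describe_url("geneid", "15275"))
print(search_url("hexokinase"))
print(links_in_endpoint_url("uniprot", "geneid", "15275"))
```

Search terms are percent-encoded with quote so that a multi-word term still forms a valid URL path segment.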
The mashup principle
To answer a complex question we first need to build a specific database, a mashup, to which we submit the appropriate query.
Cognoscope new definition
A Cognoscope is an instrument to explore and collect topics from the Linked Data cloud of SPARQL endpoints. It permits querying over a distributed network of knowledge resources.
Cognoscope definition
The magnifying effect depends on the density of links between resources (entity links), which is a by-product of human intellectual activity in the social network.
The filtering effect is based on the inherent semantics of the RDF graph, described using types and predicates.
Facet browsing is used to zoom in and out of the observed graph.
Full text search is used to discover concepts.
Cognoscope function
How can we submit a complex query over the network of SPARQL endpoints?
By using a workflow that fetches data from individual SPARQL endpoints.
We use a workflow to build the mashup.
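As a rough illustration of the fetching step, the Python sketch below prepares one request URL per endpoint. The SPARQL protocol accepts the query text in a "query" parameter of a GET request. The endpoint addresses are placeholders and the query is a generic example, not one from the Bio2RDF workflows, and nothing is sent over the network here.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint addresses (assumptions, not real Bio2RDF endpoints).
ENDPOINTS = [
    "http://endpoint-a.example.org/sparql",
    "http://endpoint-b.example.org/sparql",
]

# Generic example query; a real workflow would send a query specific to each source.
QUERY = "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 10"


def sparql_get_url(endpoint, query):
    """Build a SPARQL protocol GET URL carrying the query text."""
    return f"{endpoint}?{urlencode({'query': query})}"


request_urls = [sparql_get_url(endpoint, QUERY) for endpoint in ENDPOINTS]
for url in request_urls:
    print(url)

# Round-trip check: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(request_urls[0]).query)["query"][0]
```

A workflow engine such as Taverna plays the role of the loop over ENDPOINTS: each service call retrieves one source's triples, and the results are then loaded into the local triplestore.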
Bio2RDF Cognoscope architecture
Linked Data cloud of SPARQL endpoints
Triplestore
Virtuoso 6
Workflow engine
Taverna 2.1
By building a mashup with Taverna
Write your complex SPARQL query as if a global graph were available
Identify the needed namespaces and split the query to fetch each data source separately
Build a mashup using a Taverna workflow that instantiates a local triplestore
Execute your complex query locally on the mashup
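The four steps above can be sketched in plain Python, with in-memory lists standing in for the remote endpoints and a set of tuples standing in for the local triplestore. All identifiers and data below (nsA, nsB, ex:label, ex:xref) are invented for the illustration. In the setup described on the slides, Taverna does the fetching and Virtuoso holds the mashup.

```python
# Made-up data: each namespace is served by its own (simulated) endpoint.
SOURCES = {
    "nsA": [("nsA:1", "ex:label", "entry one"), ("nsA:1", "ex:xref", "nsB:9")],
    "nsB": [("nsB:9", "ex:label", "entry nine")],
}


def fetch(namespace):
    """Stand-in for fetching one data source from its SPARQL endpoint."""
    return SOURCES[namespace]


def build_mashup(namespaces):
    """Step 3: load every fetched source into one local store."""
    store = set()
    for namespace in namespaces:
        store.update(fetch(namespace))
    return store


def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Steps 1 and 2: the question "what are the labels of everything nsA:1
# links to?" needs both the nsA and the nsB namespaces.
mashup = build_mashup(["nsA", "nsB"])

# Step 4: run the query locally; the join crosses the two sources.
targets = [o for (_, _, o) in match(mashup, s="nsA:1", p="ex:xref")]
labels = sorted(
    label
    for target in targets
    for (_, _, label) in match(mashup, s=target, p="ex:label")
)
print(labels)
```

Neither source alone can answer the question: the link lives in nsA and the label in nsB, which is why the data is first gathered into one local graph.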
Where to get Bio2RDF Cognoscope
http://www.myexperiment.org/search?query=cognoscope
Bio2RDF SPARQL endpoints
http://delicious.com/tag/bio2rdf:sparql
Thanks
The Bio2RDF community
Centre de recherche du CHUL
Dumontier Lab
QUT eResearch Center
The software providers
OpenLink Virtuoso
Taverna community
My colleagues
Marc-Alexandre Nolin
Michel Dumontier
Peter Ansell