talk1 ben sadi for_gmod_bosc_2011
![Page 1: Talk1 ben sadi for_gmod_bosc_2011](https://reader035.vdocuments.site/reader035/viewer/2022073017/54b3e30d4a7959ac5f8b4624/html5/thumbnails/1.jpg)
SADI for GMOD: Bringing Model Organism Databases onto the Semantic Web
Ben Vandervalk, Luke McCarthy, Edward Kawas, Mark Wilkinson
James Hogg Research Centre, Heart + Lung Institute
University of British Columbia
http://code.google.com/p/sadi/wiki/SADIforGMOD
![Page 2: Talk1 ben sadi for_gmod_bosc_2011](https://reader035.vdocuments.site/reader035/viewer/2022073017/54b3e30d4a7959ac5f8b4624/html5/thumbnails/2.jpg)
SADI for GMOD: Background
SADI (Semantic Automated Discovery and Integration)
• Standard for Web services that consume/generate RDF
• Motivation: automated integration of bioinformatics data and software
GMOD (Generic Model Organism Database)
• Toolkit for building a model organism database and website
• Collection of related open source projects: e.g. Chado, GBrowse, Pathway Tools
• Many sites use GMOD components: FlyBase, BeetleBase, DictyBase, etc.
![Page 3: Talk1 ben sadi for_gmod_bosc_2011](https://reader035.vdocuments.site/reader035/viewer/2022073017/54b3e30d4a7959ac5f8b4624/html5/thumbnails/3.jpg)
SADI in a Nutshell
• to invoke a SADI service:
  o HTTP POST an RDF document to the service URI
  o e.g. $ curl --data-binary @input.rdf http://sadiframework.org/examples/hello
• to get service metadata:
  o HTTP GET on service URL
  o returns an RDF document with service name, description, etc.
  o e.g. $ curl http://sadiframework.org/examples/hello
• structure of input/output data is described in OWL
  o service provider specifies one input OWL class and one output OWL class
• strengths of SADI
  o no framework-specific messaging formats or ontologies
  o supports batch processing of inputs
  o supports long-running services (asynchronous services)
more info: http://sadiframework.org/
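The two curl calls above can also be sketched in Python with only the standard library. This is a minimal sketch, not part of SADI itself: the function names are hypothetical, and the `application/rdf+xml` content type is an assumption (the slide's curl command does not set one explicitly). The helpers only build the requests; `send` performs the network call.

```python
import urllib.request


def build_invocation(service_url: str, rdf_bytes: bytes,
                     content_type: str = "application/rdf+xml") -> urllib.request.Request:
    """Build (but do not send) the HTTP POST that invokes a SADI service."""
    return urllib.request.Request(
        service_url,
        data=rdf_bytes,                           # the input RDF document
        headers={"Content-Type": content_type},   # assumed media type
        method="POST",
    )


def build_metadata_request(service_url: str) -> urllib.request.Request:
    """Build (but do not send) the HTTP GET that fetches service metadata."""
    return urllib.request.Request(service_url, method="GET")


def send(request: urllib.request.Request) -> str:
    """Send a prepared request and return the response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Usage would look like `send(build_invocation("http://sadiframework.org/examples/hello", open("input.rdf", "rb").read()))`, mirroring the first curl command.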
![Page 4: Talk1 ben sadi for_gmod_bosc_2011](https://reader035.vdocuments.site/reader035/viewer/2022073017/54b3e30d4a7959ac5f8b4624/html5/thumbnails/4.jpg)
SADI for GMOD
• SADI services for accessing sequence feature data
• implemented as Perl CGI scripts

| Service Name | Input | Relationship | Output |
| --- | --- | --- | --- |
| get_feature_info | database identifier | is about | feature description |
| get_features_overlapping_region | genomic coordinates | overlaps | collection of feature descriptions |
| get_sequence_for_region | genomic coordinates | is represented by | DNA, RNA, or amino acid sequence |
| get_child_features | feature description | has part / derives into | collection of feature descriptions |
| get_parent_feature | feature description | is part of / derives from | collection of feature descriptions |
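As an illustration of how a client might pick a service by the kind of input it has, the service table can be held as plain data. This is only a sketch: the `Service` type and `services_accepting` helper are hypothetical names, and the input/output labels are the informal ones from the slide, not the OWL classes the services actually declare.

```python
from typing import List, NamedTuple


class Service(NamedTuple):
    name: str
    input_type: str      # informal label from the slide, not an OWL class
    relationship: str
    output_type: str


SERVICES = [
    Service("get_feature_info", "database identifier",
            "is about", "feature description"),
    Service("get_features_overlapping_region", "genomic coordinates",
            "overlaps", "collection of feature descriptions"),
    Service("get_sequence_for_region", "genomic coordinates",
            "is represented by", "DNA, RNA, or amino acid sequence"),
    Service("get_child_features", "feature description",
            "has part / derives into", "collection of feature descriptions"),
    Service("get_parent_feature", "feature description",
            "is part of / derives from", "collection of feature descriptions"),
]


def services_accepting(input_type: str) -> List[str]:
    """Return the names of services whose input matches the given label."""
    return [s.name for s in SERVICES if s.input_type == input_type]
```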
![Page 5: Talk1 ben sadi for_gmod_bosc_2011](https://reader035.vdocuments.site/reader035/viewer/2022073017/54b3e30d4a7959ac5f8b4624/html5/thumbnails/5.jpg)
SADI for GMOD: Structure of Service Input/Output RDF
Input RDF (N3), sent by HTTP POST to get_feature_info:

    @prefix lsrn: <http://purl.oclc.org/SADI/LSRN/> .
    @prefix GeneID: <http://lsrn.org/GeneID:> .

    GeneID:49962
        a lsrn:GeneID_Record ;
        sio:SIO_000008 [                      # p = 'has attribute'
            a lsrn:GeneID_Identifier ;
            sio:SIO_000300 "49962"            # p = 'has value'
        ] .

Output RDF (N3), returned by get_feature_info:

    @prefix lsrn: <http://purl.oclc.org/SADI/LSRN/> .
    @prefix GeneID: <http://lsrn.org/GeneID:> .
    @prefix FlyBase: <http://flybase.org/cgi-bin/sadi.gmod/feature?id=> .
    @prefix GenBank: <http://lsrn.org/GB:> .

    # p = 'is about'
    GeneID:49962 sio:SIO_000332 FlyBase:FBgn0040037 .

    # feature
    FlyBase:FBgn0040037
        a SO:SO_0000704 ;                     # o = 'gene'
        range:position [
            a range:RangedSequencePosition ;
            sio:SIO_000053 [                  # p = 'has proper part'
                a range:StartPosition ;
                sio:SIO_000300 26994
            ] ;
            sio:SIO_000053 [                  # p = 'has proper part'
                a range:EndPosition ;
                sio:SIO_000300 32391
            ] ;
            range:in_relation_to _:minus_strand_seq
        ] .

    _:minus_strand_seq
        sio:SIO_000011 [                      # p = 'represents'
            a strand:MinusStrand ;
            sio:SIO_000093 GenBank:AE014135   # p = 'is proper part of'
        ] .

    # reference feature (chromosome)
    FlyBase:4                                 # chromosome 4
        a SO:SO_0000105 .                     # o = 'chromosome arm'
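For illustration, the input document for get_feature_info can be generated from a bare Entrez Gene ID. This is a minimal sketch under stated assumptions: `geneid_input_n3` is a hypothetical helper, and the `sio:` prefix declaration (which the slide omits) uses the namespace given in the SPARQL example elsewhere in this talk.

```python
INPUT_TEMPLATE = """\
@prefix lsrn: <http://purl.oclc.org/SADI/LSRN/> .
@prefix GeneID: <http://lsrn.org/GeneID:> .
@prefix sio: <http://semanticscience.org/resource/> .

GeneID:{gene_id}
    a lsrn:GeneID_Record ;
    sio:SIO_000008 [                    # 'has attribute'
        a lsrn:GeneID_Identifier ;
        sio:SIO_000300 "{gene_id}"      # 'has value'
    ] .
"""


def geneid_input_n3(gene_id: str) -> str:
    """Return an N3 input document describing one GeneID record."""
    # Accept only plain ASCII digits so nothing unexpected ends up in the N3 text.
    if not (gene_id.isascii() and gene_id.isdigit()):
        raise ValueError(f"not a numeric GeneID: {gene_id!r}")
    return INPUT_TEMPLATE.format(gene_id=gene_id)
```

Calling `geneid_input_n3("49962")` yields a document with the same shape as the input on the slide, ready to POST to the service.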
![Page 6: Talk1 ben sadi for_gmod_bosc_2011](https://reader035.vdocuments.site/reader035/viewer/2022073017/54b3e30d4a7959ac5f8b4624/html5/thumbnails/6.jpg)
SADI for GMOD: Setting up the Services
1. Load your GFF files into a Bio::DB::SeqFeature::Store database (mysql)
2. Install SADI for GMOD dependencies with CPAN
3. Download the SADI for GMOD tarball and unpack into cgi-bin
4. Set DB connection parameters in cgi-bin/sadi.gmod/sadi.gmod.conf

    [GENERAL]
    db_adaptor = Bio::DB::SeqFeature::Store
    db_args = -adaptor DBI::mysql -dsn dbi:mysql:database=flybase
    base_url = http://flybase.org/cgi-bin/sadi.gmod/

5. Configure Dbxref mappings in cgi-bin/sadi.gmod/dbxref.conf

    [DBXREF_TO_LSRN]
    SwissProt = UniProt
    UniProtKB = UniProt
    SwissProt/TrEMBL = UniProt
    ...

6. Register the services in public SADI registry: http://sadiframework.org/registry
more info: http://code.google.com/p/sadi/wiki/SADIforGMOD
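To check a configuration before deploying, the two files from steps 4 and 5 can be read as INI-style text. The sketch below uses Python's `configparser` purely for illustration; the services themselves are Perl CGI scripts and may parse these files differently. `parse_ini` is a hypothetical helper, and the sample strings repeat only the entries shown on the slide.

```python
import configparser

SAMPLE_SADI_GMOD_CONF = """\
[GENERAL]
db_adaptor = Bio::DB::SeqFeature::Store
db_args = -adaptor DBI::mysql -dsn dbi:mysql:database=flybase
base_url = http://flybase.org/cgi-bin/sadi.gmod/
"""

SAMPLE_DBXREF_CONF = """\
[DBXREF_TO_LSRN]
SwissProt = UniProt
UniProtKB = UniProt
SwissProt/TrEMBL = UniProt
"""


def parse_ini(text: str) -> dict:
    """Parse INI-style text into {section: {key: value}}."""
    # Split on '=' only, so the '::' and ':' inside values are left alone.
    parser = configparser.ConfigParser(delimiters=("=",), interpolation=None)
    parser.optionxform = str  # keep key case, e.g. 'SwissProt'
    parser.read_string(text)
    return {section: dict(parser[section]) for section in parser.sections()}
```

For example, `parse_ini(SAMPLE_DBXREF_CONF)["DBXREF_TO_LSRN"]` gives the mapping from Dbxref database names to LSRN namespaces.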
![Page 7: Talk1 ben sadi for_gmod_bosc_2011](https://reader035.vdocuments.site/reader035/viewer/2022073017/54b3e30d4a7959ac5f8b4624/html5/thumbnails/7.jpg)
SADI Client Software
SHARE Query Engine
• SPARQL Query => SADI Workflow
• http://biordf.net/cardioSHARE/query
SADI Taverna Plugin
• Design SADI workflows
• http://sadiframework.org/content/2010/05/03/sadi-taverna-plugin-tutorial/
![Page 8: Talk1 ben sadi for_gmod_bosc_2011](https://reader035.vdocuments.site/reader035/viewer/2022073017/54b3e30d4a7959ac5f8b4624/html5/thumbnails/8.jpg)
Acknowledgements
Team
• Mark Wilkinson: Principal Investigator
• Luke McCarthy: Lead Programmer, SADI & SHARE
• Edward Kawas: Perl Programmer, SADI
Funding
• Microsoft Research
http://sadiframework.org/
![Page 9: Talk1 ben sadi for_gmod_bosc_2011](https://reader035.vdocuments.site/reader035/viewer/2022073017/54b3e30d4a7959ac5f8b4624/html5/thumbnails/9.jpg)
Extra Slides
![Page 10: Talk1 ben sadi for_gmod_bosc_2011](https://reader035.vdocuments.site/reader035/viewer/2022073017/54b3e30d4a7959ac5f8b4624/html5/thumbnails/10.jpg)
Demo with SHARE Query Engine
SPARQL Query SADI Workflow
"What proteins are homologous to FlyBase protein FBpp0288804?"
PREFIX FlyBase: <http://lsrn.org/FLYBASE:>
PREFIX sio: <http://semanticscience.org/resource/>

SELECT ?homolog
WHERE {
    # SIO_000332 = 'is about'
    FlyBase:FBpp0288804 sio:SIO_000332 ?protein .
    # SIO_000205 = 'is represented by'
    ?protein sio:SIO_000205 ?sequence .
    # SIO_010302 = 'is homologous to'
    ?protein sio:SIO_010302 ?homolog .
}
online demo: http://biordf.net/cardioSHARE/query
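The demo query can be parameterised by protein ID. This is a minimal sketch: `homolog_query` is a hypothetical helper, and the `FBpp` plus digits pattern is an assumption based on the single ID shown here. It only builds the query text; how the query is submitted to SHARE is not covered.

```python
import re

QUERY_TEMPLATE = """\
PREFIX FlyBase: <http://lsrn.org/FLYBASE:>
PREFIX sio: <http://semanticscience.org/resource/>

SELECT ?homolog
WHERE {{
    # SIO_000332 = 'is about'
    FlyBase:{protein_id} sio:SIO_000332 ?protein .
    # SIO_000205 = 'is represented by'
    ?protein sio:SIO_000205 ?sequence .
    # SIO_010302 = 'is homologous to'
    ?protein sio:SIO_010302 ?homolog .
}}
"""

# Assumed ID shape: 'FBpp' followed by digits, as in FBpp0288804.
_FBPP_ID = re.compile(r"FBpp\d+")


def homolog_query(protein_id: str) -> str:
    """Return the SPARQL homolog query for one FlyBase protein ID."""
    if not _FBPP_ID.fullmatch(protein_id):
        raise ValueError(f"unexpected FlyBase protein ID: {protein_id!r}")
    return QUERY_TEMPLATE.format(protein_id=protein_id)
```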