DReSS
Engineering a Replay Application Based on RDF and OWL
Chris Greenhalgh, Andy French, Jan Humble, Paul Tennent
School of Computer Science, University of Nottingham
email: {cmg, apf, jch, pxt}@cs.nott.ac.uk
Content
Introduction to DRS
Persistence and state in DRS
Using RDF and OWL within DRS
Performance strategies and issues
Conclusions and Future Work
Introduction to DRS
Persistence and State in DRS
[Diagram labels: DRS GUI; RDBMS; JENA RDF models; Media & log files; File-backed HSQL DBs; Index; Project(s)]
Persistence and State in DRS
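[Diagram labels: JENA RDF models; Media & log files; File-backed HSQL DBs; Index; Project(s)]
What each store holds:
– JENA RDF models: operational (meta)data, annotations, other metadata
– Media & log files: video, audio, images, documents
– File-backed HSQL DBs: highly structured log file events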
OWL Ontologies
digitalrecord.owl – core digital record
– E.g. Media (files), Projects, “Analyses”, People, Annotations, Times, Timelines, Activities
replaytool2.owl – DRS configuration and security
guiconfiguration.owl – new GUI configuration options
logfileworkbench.owl – for working with system log files and databases
– E.g. databases, representations of time, table column types
Implementation: Ontology wrapper classes
Wrappergen tool
– Reads ontology
– Generates Java:
• Interface hierarchy (multiple inheritance)
• JENA & JavaBean implementation hierarchies (single inheritance)
– Type-safe Java programming
– Limited query support
• Still use SPARQL
– Open Source (DRS CVS)
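[Class diagram]
Interfaces:
– «interface» Thing: +getResource(), +getModel()
– «interface» Media (extends Thing): +getHasMimeType(), +setHasMimeType(), +isSetHasMimeType(), +unsetHasMimeType(), +addHasMimeType(), +removeHasMimeType()
– «interface» Video (extends Media)
Implementation classes: ThingImpl, MediaImpl, VideoImpl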
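The shape of the generated code can be sketched in plain Java. This is only an illustration of the pattern on the slide (interfaces with multiple inheritance, implementation classes with single inheritance), not Wrappergen output. The `Timed` interface, the `getUri()` accessor (standing in for `getResource()`/`getModel()`), the use of `String` values, and the in-memory set that replaces the JENA model are all assumptions made to keep the example self-contained. The exact semantics of the set/add/unset accessors are likewise assumed.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Interface hierarchy: may use multiple inheritance, mirroring OWL classes.
interface Thing {
    String getUri(); // hypothetical stand-in for getResource()/getModel()
}

interface Timed extends Thing { } // hypothetical second parent class

interface Media extends Thing {
    Set<String> getHasMimeType();
    void setHasMimeType(String value);    // assumed: replace all values with one
    boolean isSetHasMimeType();
    void unsetHasMimeType();              // assumed: remove all values
    void addHasMimeType(String value);
    void removeHasMimeType(String value);
}

interface Video extends Media, Timed { }  // multiple inheritance of interfaces

// Implementation hierarchy: single inheritance only.
class ThingImpl implements Thing {
    private final String uri;
    ThingImpl(String uri) { this.uri = uri; }
    public String getUri() { return uri; }
}

class MediaImpl extends ThingImpl implements Media {
    // In-memory stand-in for property values stored in an RDF model.
    private final Set<String> mimeTypes = new LinkedHashSet<>();
    MediaImpl(String uri) { super(uri); }
    public Set<String> getHasMimeType() { return Collections.unmodifiableSet(mimeTypes); }
    public void setHasMimeType(String value) { mimeTypes.clear(); mimeTypes.add(value); }
    public boolean isSetHasMimeType() { return !mimeTypes.isEmpty(); }
    public void unsetHasMimeType() { mimeTypes.clear(); }
    public void addHasMimeType(String value) { mimeTypes.add(value); }
    public void removeHasMimeType(String value) { mimeTypes.remove(value); }
}

class VideoImpl extends MediaImpl implements Video {
    VideoImpl(String uri) { super(uri); }
}
```

Client code would then hold a `Video` reference and call typed accessors such as `addHasMimeType("video/quicktime")` instead of manipulating RDF statements directly, which is where the type safety comes from.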
Performance strategies (1)
Horses for courses – RDF, files and databases
– For (meta)data, media and processed log-file data/events
Divide and conquer – top-level division into “projects”
– Separate JENA models for each
Deliberately limited inference
– In (separate) ontology model only
– RDFS entailments only
– Requires explicit expansion of queries, e.g. instances of a class (including instances of subclasses)
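Explicit query expansion can be sketched as follows: with no inference on the data model, a query for instances of a class has to name the class and every subclass itself. The sketch below is an assumption about how this might be done, not the DRS code. The class `QueryExpansion`, its method names, and the subclass map (which in practice would be read from the ontology model) are hypothetical; the generated text is an ordinary SPARQL `UNION` query.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class QueryExpansion {
    // Returns cls together with all of its direct and indirect subclasses.
    static Set<String> withSubclasses(String cls, Map<String, List<String>> directSubclasses) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.push(cls);
        while (!todo.isEmpty()) {
            String current = todo.pop();
            if (seen.add(current)) { // skip classes already visited
                for (String sub : directSubclasses.getOrDefault(current, Collections.emptyList())) {
                    todo.push(sub);
                }
            }
        }
        return seen;
    }

    // Builds a SPARQL query that matches instances of any of the given classes.
    static String instancesQuery(Set<String> classUris) {
        StringBuilder sb = new StringBuilder("SELECT DISTINCT ?x WHERE {\n");
        boolean first = true;
        for (String uri : classUris) {
            sb.append(first ? "  " : "  UNION ");
            sb.append("{ ?x a <").append(uri).append("> }\n");
            first = false;
        }
        return sb.append("}").toString();
    }
}
```

The cost of this design is that every such query grows with the depth of the class hierarchy, in exchange for keeping inference out of the (much larger) project data models.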
Performance strategies (2)
Model caching
– JENA Monitor model tracks changes
– Incremental changes flushed by explicit user action (save/close)
– Possible building block for undo/redo
[Diagram labels: API; In-memory model; Monitor model; Statements added/removed; Save/exit; Persistent model; JDBC; RDB]
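The caching idea above can be illustrated with a small change-tracking store. This is a plain-Java sketch under stated assumptions, not the JENA API: `TrackedStore` and its methods are hypothetical, statements are reduced to strings, and the "persistent" set stands in for the RDB-backed model. Edits touch only the in-memory copy, while the net additions and removals are recorded and written out on an explicit save.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

class TrackedStore {
    private final Set<String> persistent = new LinkedHashSet<>(); // stand-in for the RDB model
    private final Set<String> memory = new LinkedHashSet<>();     // working copy used by the app
    private final Set<String> added = new LinkedHashSet<>();      // net additions since last save
    private final Set<String> removed = new LinkedHashSet<>();    // net removals since last save

    void add(String statement) {
        if (memory.add(statement)) {
            // Re-adding something removed since the last save cancels that removal.
            if (!removed.remove(statement)) {
                added.add(statement);
            }
        }
    }

    void remove(String statement) {
        if (memory.remove(statement)) {
            // Removing something added since the last save cancels that addition.
            if (!added.remove(statement)) {
                removed.add(statement);
            }
        }
    }

    int pendingChanges() {
        return added.size() + removed.size();
    }

    // Flushes only the net changes to the persistent store; returns how many were written.
    int save() {
        int written = pendingChanges();
        persistent.removeAll(removed);
        persistent.addAll(added);
        added.clear();
        removed.clear();
        return written;
    }

    Set<String> persisted() {
        return Collections.unmodifiableSet(persistent);
    }
}
```

Because the pending additions and removals are kept as explicit sets, the same record could in principle be inverted to undo a batch of edits, which is one way to read the "building block for undo/redo" remark.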
Performance issues
Choice of RDBMS with JENA
– MySQL
• Requires installation and configuration
• 25 ms/statement insert with duplicate checking
• 0.75 ms/statement insert without duplicate checking
– HSQLDB
• Runs embedded and file-backed – no installation
• 0.22 ms/statement insert with duplicate checking, 2170 statements
• 1.22 ms/statement insert with duplicate checking, 13000 statements
• 0.33 ms/statement insert without duplicate checking, 15000 statements
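To put the HSQLDB per-statement figures in context, the total insert time they imply is simply the statement count multiplied by the per-statement cost. The snippet below only redoes that arithmetic for the three HSQLDB rows that report a statement count; the class and method names are hypothetical, and no new measurements are involved.

```java
class InsertCost {
    // Total time in seconds for the given number of inserts at a fixed cost per statement.
    static double totalSeconds(int statements, double msPerStatement) {
        return statements * msPerStatement / 1000.0;
    }

    public static void main(String[] args) {
        // HSQLDB figures from the slide: (statements, ms per statement).
        System.out.printf("with checking, 2170 statements:     %.2f s%n", totalSeconds(2170, 0.22));
        System.out.printf("with checking, 13000 statements:    %.2f s%n", totalSeconds(13000, 1.22));
        System.out.printf("without checking, 15000 statements: %.2f s%n", totalSeconds(15000, 0.33));
    }
}
```

The totals come to roughly 0.5 s, 16 s and 5 s respectively. With duplicate checking the per-statement cost rises from 0.22 ms to 1.22 ms as the model grows from 2170 to 13000 statements, so the cost of checking does not appear to be constant per statement.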
Conclusions and Future Work
Effective combination of RDF, files and databases
– Various performance strategies
– Not yet validated with very large projects…
Reasonable software development support
– Wrapper classes, JENA API, SPARQL queries
Beginnings of ontology-driven interface elements
– Big usability challenges
Work-group server support – in progress
– CVS-like RDF check-out/check-in approach for intermittently networked collaboration
Open Source release, ongoing development/support
Acknowledgements
This work was supported by the ESRC through the grant “Understanding New Forms of Digital Record for E-Social Science” (the DReSS node of the NCeSS) and by the EPSRC through grant EP/C010078/1, “Semantic Media - Pervasive Annotation for e-Research” and the EQUATOR IRC, grant GR/N15986/01.
With thanks to our collaborators in those projects. With thanks to the Thrill project.
Release 1: http://thedrs.sourceforge.net
Announcing the Digital Replay System Release 1
Supports the coordinated replay, annotation and analysis of combinations of video, audio, transcripts, images and system log files.
Requires Windows or Mac OSX, Java 1.5+ and Apple QuickTime.
Free and Open Source – http://thedrs.sourceforge.net
[Screenshot labels: Multiple videos; Timeline; Log data; Transcript]