Semantic Search on Heterogeneous Wiki Systems - poster


Upload: fabrizio-orlandi

Post on 28-Aug-2014



Semantic Search on Heterogeneous Wiki Systems

Fabrizio Orlandi, Alexandre Passant

Acknowledgements

The work presented in this poster has been funded in part by Science Foundation Ireland under Grant No. SFI/08/CE/I1380 (Líon-2) and by an IRCSET Scholarship.

Motivations

Wikis are widely used both on the Web, with well-known systems such as Wikipedia, and in the workplace, for instance for project management or customer relationships. However, each wiki system relies on its own data structure and API to model its data and give developers access to it. As a result, wikis act as isolated systems: information from one cannot be easily integrated with information from another. This is a drawback whenever users need to access information spread across several wikis, on the Web or in the enterprise. We propose an approach based on Semantic Web technologies and Linked Data principles to address these issues and to enable semantic search across heterogeneous wiki systems.

Abstract

We developed a system that enables semantic search across heterogeneous wikis in a unified way using Semantic Web technologies. In particular: I) we designed a common model for representing social and structural wiki features; II) we extracted semantic data from wikis running on two relevant wiki engines; III) we built an efficient application with a simple user interface that provides semantic searching and browsing capabilities on top of different interlinked wikis.

Our Contributions

A common semantic model for representing wiki structure and contributions in RDF (Resource Description Framework), encompassing previous models in the area

Various data exporters for popular wiki systems, translating wiki information into RDF annotations (based on our model) in real time: a web service for MediaWiki wikis and a plug-in for DokuWiki

A semantic search engine which provides means to retrieve information contained in heterogeneous wikis in a novel and user-friendly way. It allows for semantic searching and faceted browsing capabilities on top of the collected data
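To make the idea of a common model more concrete, the sketch below describes one wiki article as RDF triples and serialises them as N-Triples. It is only an illustration, not the exporters' actual code: the example.org URIs, the helper names, and the choice of terms (sioct:WikiArticle, sioc:has_container, sioc:has_creator, dcterms:title) are assumptions made for the example. The point is that an article exported from MediaWiki and one exported from DokuWiki end up with the same shape.

```python
# Illustrative only: helper names, example URIs and the selection of
# vocabulary terms are assumptions, not taken from the poster.
SIOC = "http://rdfs.org/sioc/ns#"
SIOCT = "http://rdfs.org/sioc/types#"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


class Literal(str):
    """Marks a plain string value, as opposed to a URI."""


def article_to_triples(article_uri, wiki_uri, author_uri, title):
    """Describe one wiki article as (subject, predicate, object) triples."""
    return [
        (article_uri, RDF_TYPE, SIOCT + "WikiArticle"),
        (article_uri, SIOC + "has_container", wiki_uri),
        (article_uri, SIOC + "has_creator", author_uri),
        (article_uri, DCTERMS + "title", Literal(title)),
    ]


def to_ntriples(triples):
    """Serialise triples as N-Triples, one statement per line."""
    lines = []
    for subject, predicate, obj in triples:
        if isinstance(obj, Literal):
            # Escape backslashes first, then double quotes.
            escaped = obj.replace("\\", "\\\\").replace('"', '\\"')
            rendered = '"' + escaped + '"'
        else:
            rendered = "<" + obj + ">"
        lines.append("<" + subject + "> <" + predicate + "> " + rendered + " .")
    return "\n".join(lines)


if __name__ == "__main__":
    # The same function serves both engines; only the URIs differ.
    mediawiki = article_to_triples(
        "http://example.org/mediawiki/Semantic_Web",
        "http://example.org/mediawiki",
        "http://example.org/mediawiki/User:Alice",
        "Semantic Web",
    )
    dokuwiki = article_to_triples(
        "http://example.org/dokuwiki/semantic_web",
        "http://example.org/dokuwiki",
        "http://example.org/dokuwiki/user/alice",
        "Semantic Web",
    )
    print(to_ntriples(mediawiki + dokuwiki))
```

Because both exports share one vocabulary, a single query can span both wikis without engine-specific handling.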

Activity diagram of the SIOC-MediaWiki webservice exporter

Page versioning model with SIOC properties. Note that the transitive properties earlier_version and later_version are displayed only for the latest and the first wiki article.
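The transitivity of earlier_version and later_version can be illustrated with a small sketch. It assumes each revision records only a direct link to its immediate predecessor (called previous_version here, which is an assumption for this example) and derives the transitive sets from that chain. The revision names and function names are hypothetical.

```python
def earlier_versions(previous_version):
    """Derive all earlier revisions of each revision.

    previous_version maps a revision to its direct predecessor,
    or to None for the first revision. The result lists earlier
    revisions nearest first.
    """
    closure = {}
    for revision in previous_version:
        chain = []
        seen = {revision}
        current = previous_version.get(revision)
        # Walk back along the direct links; 'seen' guards against cycles.
        while current is not None and current not in seen:
            chain.append(current)
            seen.add(current)
            current = previous_version.get(current)
        closure[revision] = chain
    return closure


def later_versions(earlier):
    """Invert the earlier-version relation to get later versions."""
    later = {revision: [] for revision in earlier}
    for revision, older_ones in earlier.items():
        for older in older_ones:
            later.setdefault(older, []).append(revision)
    return later


if __name__ == "__main__":
    links = {"rev3": "rev2", "rev2": "rev1", "rev1": None}
    earlier = earlier_versions(links)
    print(earlier["rev3"])                        # all versions before rev3
    print(sorted(later_versions(earlier)["rev1"]))  # all versions after rev1
```

With three revisions, the latest one has two earlier versions and the first one has two later versions, which matches why the diagram draws these properties only for the two ends of the chain.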

Results

In total, we collected about 1 GB of RDF data and loaded it into the Sesame RDF store. The total number of triples is around 45,500, covering 3,400 wiki articles and 700 users. Every query returns its results in less than 3 seconds, and the system can answer questions such as:

“Who are the co-authors of user X, and on which articles did they collaborate?”

“Which topics and wiki sites did user X contribute to most in the past six months?”
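The first question can be sketched over a handful of in-memory (article, author) pairs. This is a minimal illustration of the logic only; the poster does not show the query that runs against the RDF store, and the data, identifiers, and function name below are hypothetical. The sketch also assumes that the same person is identified by the same identifier on every wiki.

```python
from collections import defaultdict


def co_authors(contributions, user):
    """Return {co_author: sorted list of articles shared with user}.

    contributions is an iterable of (article, author) pairs, possibly
    collected from several wikis.
    """
    authors_by_article = defaultdict(set)
    for article, author in contributions:
        authors_by_article[article].add(author)

    shared = defaultdict(set)
    for article, authors in authors_by_article.items():
        if user in authors:
            for other in authors - {user}:
                shared[other].add(article)
    return {other: sorted(articles) for other, articles in shared.items()}


if __name__ == "__main__":
    # Hypothetical contributions gathered from two different wikis.
    data = [
        ("mediawiki:RDF", "alice"),
        ("mediawiki:RDF", "bob"),
        ("dokuwiki:sioc", "alice"),
        ("dokuwiki:sioc", "carol"),
        ("dokuwiki:sioc", "bob"),
        ("mediawiki:Other", "dave"),
    ]
    for other, articles in sorted(co_authors(data, "alice").items()):
        print(other, articles)
```

Because the contributions from both wikis share one representation, the same lookup finds collaborators regardless of which engine hosts the article.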


Conclusion

Despite its clean interface, the presented application supports advanced and fast querying and the discovery of hidden knowledge, showing potential that traditional Web 2.0 tools cannot offer. Hence we showed an overall benefit of applying Semantic Web technologies to wikis, enabling users to access the information generated by this process in a simple and transparent way.