Microblogging: A Semantic Web and Distributed Approach

Copyright 2008 Digital Enterprise Research Institute. All rights reserved.
www.deri.ie

Alexandre Passant (1), Tuukka Hastrup (2), Uldis Bojārs (2), John Breslin (2)
(1) LaLIC, Université Paris-Sorbonne
(2) Digital Enterprise Research Institute, National University of Ireland, Galway

Scripting For the Semantic Web (SFSW2008), Tenerife, Spain, 2008-06-02

SFSW2008, 2nd June 2008, Tenerife - http://www.semanticscripting.org/SFSW2008/

Page 2: Microblogging: A Semantic Web and Distributed Approach

Microblogging overview

• Sweet spot between blogging and instant messaging
• Short status notification updates
  – Share your real life with others!

Page 3: Microblogging: A Semantic Web and Distributed Approach

Why and how does it work?

• A ubiquitous network of communication
  – Various communication channels: Web, phone messages, e-mail
  – Simple approach for publishing data, following and replying
• A fluid network for information exchange in real-time
• Services
  – Online platforms: Twitter, Jaiku, Pownce ...
  – Plug-ins for existing services: Prologue for WordPress
• Microblogging in organisations?
  – Corporate microblogging: real-time Q&A
  – Extends the Enterprise 2.0 vision
    • Internal Signals (SLATES)

Page 4: Microblogging: A Semantic Web and Distributed Approach

Issue #1: Data ownership and portability

• A centralised approach
  – Need to register to (one more!) social platform
  – “Social Network Fatigue”
  – Users of different services cannot communicate
  – Would you use webmail that only allows you to send mail to people using the same provider?
• Users do not own the content they publish
  – It belongs to a proprietary and closed service
  – What if it closes? How do I move my data between services?
  – Would you register to a webmail that does not provide POP or SMTP?
• Users do not own their social network
  – And cannot reuse existing ones: invite, again, again and again ...
• Yet, Twitter provides XFN export of people you follow

Page 5: Microblogging: A Semantic Web and Distributed Approach

Issue #2: Meta-data

• Lack of unified, machine-readable meta-data
  – Unified queries over a set of services?
    • All microblog content posted ten days ago?
  – APIs?
    • For each service, a new API must be learnt
• Extract machine-readable meta-data from Twitter
  – Merge RSS feeds with XML export available for each update
  – Map result data with Semantic Web vocabularies
    • Dublin Core, SIOC ...
  – Use Sindice / SWSE to guess URIs of people
    • From a user name to a FOAF URI (as in SWAML)
  – A complex process, latest updates only (RSS-based)

Page 6: Microblogging: A Semantic Web and Distributed Approach

Issue #3: Content semantics

• Lack of semantics in status updates
  – Updates dealing with programming languages?
  – What happens in my neighbourhood?
• Want to extend meta-data
  – Locations the post talks about
• Hash tags? Lead to the same issues as tagging
  – Ambiguity
    • #paris? #swig?
  – Heterogeneity
    • #semweb, #websemantique
  – Lack of organisation
    • How to relate #rdfa and #semanticweb?
    • Which tags to follow if I’m interested in SW?

Page 7: Microblogging: A Semantic Web and Distributed Approach

Our approach to microblogging

• Goal: To provide an open and flexible alternative to current microblogging systems
  – Distributed, open, user-controlled, reusable, scalable, based on standards
• Means: The Semantic Web!
  – SIOC and FOAF as the main vocabularies
  – Semantics for both meta-data and status content
  – Linked Data principles
• Proof of concept: SMOB
  – Open-source software for distributed microblogging
  – An ecosystem of distributed publishers and aggregators

Page 8: Microblogging: A Semantic Web and Distributed Approach

A common model for meta-data

• Modelling users (physical persons) with FOAF
  – Friend Of A Friend
  – Ability to reuse one’s personal profile created from an external application (LiveJournal, Flickr exporter ...)
  – Interlinking various profile URIs on the Web using Linked Data principles
• Modelling accounts and data with SIOC
  – Semantically-Interlinked Online Communities
  – Linking an existing FOAF profile to an online account, instead of creating yet another disconnected one
  – Extended with Microblog and MicroblogPost classes
    • Subclasses of Container and Item
  – Use other SIOC / DC properties to model the data
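
As a rough illustration of this model, the sketch below serialises one status update as Turtle. It types the post as sioct:MicroblogPost and links it to its author's FOAF URI and to its microblog. The function names and example URIs are hypothetical, and the use of sioc:has_container to attach the post to its Microblog is an assumption; the slides only state that Microblog and MicroblogPost subclass Container and Item.

```python
from datetime import datetime, timezone

# Namespace declarations for the vocabularies named on the slide.
PREFIXES = """\
@prefix sioc:  <http://rdfs.org/sioc/ns#> .
@prefix sioct: <http://rdfs.org/sioc/types#> .
@prefix foaf:  <http://xmlns.com/foaf/0.1/> .
@prefix dct:   <http://purl.org/dc/terms/> .
"""


def escape_literal(text: str) -> str:
    """Escape backslashes, double quotes and newlines for a Turtle literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def post_to_turtle(post_uri: str, microblog_uri: str, maker_uri: str,
                   content: str, created: datetime) -> str:
    """Return a Turtle document describing one microblog post.

    All URIs are supplied by the caller; nothing here is SMOB's own layout.
    """
    return (
        PREFIXES + "\n"
        f"<{microblog_uri}> a sioct:Microblog .\n\n"
        f"<{post_uri}> a sioct:MicroblogPost ;\n"
        # Assumption: has_container links the post to its microblog.
        f"    sioc:has_container <{microblog_uri}> ;\n"
        # The maker is an existing FOAF URI, reused rather than re-created.
        f"    foaf:maker <{maker_uri}> ;\n"
        f'    sioc:content "{escape_literal(content)}" ;\n'
        f'    dct:created "{created.isoformat()}" .\n'
    )
```

Because the author is referenced by URI, the same FOAF profile can be reused by any publishing client without copying profile data into the post.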

Page 9: Microblogging: A Semantic Web and Distributed Approach

FOAF + SIOC: Semantics for data portability

Page 10: Microblogging: A Semantic Web and Distributed Approach

Post example with the Tabulator

Page 11: Microblogging: A Semantic Web and Distributed Approach

Modelling content of status updates

• URIs instead of hash tags
  – Uniform description of resources (DBpedia ...)
  – Modelled using sioc:topic between the content and the URI
• Microblogging enters the Linked Data Web!
  – Need to find a user-friendly way to bridge this gap
• Prefixed hash tags
  – #dbp:Eiffel_Tower - Simple DBpedia mapping
    • http://dbpedia.org/resource/Eiffel_Tower
  – #geo:Paris,France - Using geonames.org webservice
    • Querying the service to retrieve location URI
• Can be used in lookup services such as Sindice
  – New ways to discover content
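
The prefixed hash tag idea can be sketched as a small regular-expression pass over the status text. This is a minimal sketch, not SMOB's actual code: the pattern, the function names and the pluggable geo_lookup callback are assumptions, and the geonames.org call is left as a placeholder because its request format is not given in the slides.

```python
import re
from urllib.parse import quote

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

# A prefixed hash tag: '#', a lowercase prefix, ':', then a value
# running up to the next whitespace or '#'.
PREFIXED_TAG = re.compile(r"#(?P<prefix>[a-z]+):(?P<value>[^\s#]+)")


def lookup_geonames(place: str):
    """Placeholder for a geonames.org query; always returns None here."""
    return None


def extract_topics(status: str, geo_lookup=lookup_geonames):
    """Return the topic URIs denoted by prefixed hash tags in a status."""
    topics = []
    for match in PREFIXED_TAG.finditer(status):
        prefix = match.group("prefix")
        # Drop sentence punctuation that happens to trail the tag.
        value = match.group("value").rstrip(".,;!?")
        if prefix == "dbp":
            # Simple mapping: the tag value is the DBpedia resource name.
            topics.append(DBPEDIA_RESOURCE + quote(value, safe="_(),"))
        elif prefix == "geo":
            # Delegated to a service; skipped when nothing is found.
            uri = geo_lookup(value)
            if uri is not None:
                topics.append(uri)
    return topics
```

Each returned URI would then be attached to the post with sioc:topic. Plain hash tags without a prefix are ignored by this sketch.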

Page 12: Microblogging: A Semantic Web and Distributed Approach

A distributed architecture

• Vision: Open, distributed
  – Follow the spirit of the Web architecture
  – A network of publishing services and aggregation servers interacting with each other
  – A microblogging ecosystem
  – New providers or aggregators can be added at any time, anywhere on the network
  – Provide standards, methods and open-source tools rather than a closed proprietary approach

Page 13: Microblogging: A Semantic Web and Distributed Approach

Architecture overview

Page 14: Microblogging: A Semantic Web and Distributed Approach

Data ownership

• Publisher stores its content locally, then provides it to aggregators, which cache it in a triple store
  – Data belongs to the user
  – If an aggregator closes, data is still there
  – Available in RDF: mashable, browsable, linkable ...
  – Can be combined with other Social Media Contributions modelled using SIOC
    • Retrieve all blog posts and microblogging updates of the last week
• Focusing on ideas from “A bill of rights for the Social Web”
  – http://opensocialweb.org/2007/09/05/bill-of-rights/
  – Ownership, Control, Freedom
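
The combined query mentioned above (all blog posts and microblogging updates of the last week) might look like the following sketch. It assumes that blog posts are typed sioct:BlogPost, that creation dates are stored as xsd:dateTime literals under dct:created, and that the caller passes a timezone-aware datetime. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from string import Template

# $cutoff is filled in by last_week_query(); SPARQL braces are left untouched
# because string.Template only substitutes $-prefixed names.
QUERY_TEMPLATE = Template("""\
PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sioct: <http://rdfs.org/sioc/types#>
PREFIX dct:   <http://purl.org/dc/terms/>
PREFIX xsd:   <http://www.w3.org/2001/XMLSchema#>

SELECT ?item ?date
WHERE {
  { ?item rdf:type sioct:BlogPost } UNION { ?item rdf:type sioct:MicroblogPost }
  ?item dct:created ?date .
  FILTER (?date >= "$cutoff"^^xsd:dateTime)
}
ORDER BY DESC(?date)
""")


def last_week_query(now: datetime) -> str:
    """Build a SPARQL query for items created in the 7 days before `now`.

    `now` must be timezone-aware; it is converted to UTC before formatting.
    """
    cutoff = now.astimezone(timezone.utc) - timedelta(days=7)
    return QUERY_TEMPLATE.substitute(cutoff=cutoff.strftime("%Y-%m-%dT%H:%M:%SZ"))
```

Whether the FILTER compares dates correctly depends on how the store types the literals; a store holding plain strings would need a string comparison instead.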

Page 15: Microblogging: A Semantic Web and Distributed Approach

SMOB: A prototype for semantic microblogging

• SMOB
  – http://smob.sioc-project.org
  – Open-source client and server software to demonstrate principles of our approach
  – Early stage of development
• First prototype in a day and very few lines of PHP
  – Still a prototype, some challenges remain:
    • Scalability
    • SPARQL query complexity on the server side
    • Authentication
• A public SMOB aggregator and anonymous publishing client deployed
  – 3 weeks, 10 users, 90 posts

Page 16: Microblogging: A Semantic Web and Distributed Approach

Publishing content with SMOB

• Reusing your FOAF profile
  – Creating RDF data using the SIOC PHP API
• Publishing to various aggregators
  – Twitter integration, promote SW by using it for your tweets!

Page 17: Microblogging: A Semantic Web and Distributed Approach

Browsing local content

• Listing of latest updates, embeds RDFa

Page 18: Microblogging: A Semantic Web and Distributed Approach

Storing aggregated content in SMOB server

• Aggregators receive pings and cache the RDF documents in real-time
• Hash tag interpretation with regular expressions
  – geonames.org wrapper for #geo: tags
  – DBpedia links for #dbp: tags
• Based on the ARC2 API for storage / queries and Exhibit for the browsing interface
  – SPARUL “LOAD” pattern to get data
  – SPARQL to format data to Exhibit JSON
  – Exhibit for faceted browsing
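
The ping-then-cache step can be sketched as a function that turns a pinged document URL into a SPARUL LOAD statement for the aggregator's store. This is an illustration under assumptions: the function name is hypothetical, and loading each document into a named graph with the same URI is a design choice made here, not something the slides specify.

```python
from urllib.parse import urlparse

# Characters that must not appear inside <...> in a SPARQL IRI reference.
FORBIDDEN_IN_IRI = '<>"{}|\\^` \n\t'


def build_load_query(document_url: str) -> str:
    """Build a SPARUL LOAD statement that caches a pinged RDF document.

    The document goes into a named graph with the same URI, so a later
    ping for the same document can refresh that graph alone.
    """
    parts = urlparse(document_url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute HTTP(S) URL: {document_url!r}")
    if any(ch in document_url for ch in FORBIDDEN_IN_IRI):
        raise ValueError("URL contains characters not allowed in an IRI")
    return f"LOAD <{document_url}> INTO <{document_url}>"
```

Validating the URL before it is interpolated keeps a malformed or hostile ping from altering the update statement.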

Page 19: Microblogging: A Semantic Web and Distributed Approach

SPARQL query example

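• Retrieve latest updates from the server (results are de-duplicated in PHP)

PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX sioc:  <http://rdfs.org/sioc/ns#>
PREFIX sioct: <http://rdfs.org/sioc/types#>
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX dct:   <http://purl.org/dc/terms/>

SELECT ?post ?date ?content ?maker ?name ?depiction
WHERE {
  ?post rdf:type sioct:MicroblogPost ;
        foaf:maker ?maker ;
        sioc:content ?content ;
        dct:created ?date .
  ?maker foaf:name ?name .
  { ?maker foaf:img ?depiction } UNION { ?maker foaf:depiction ?depiction }
}
ORDER BY DESC(?date)
LIMIT 20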
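
A maker who has both foaf:img and foaf:depiction matches both branches of the UNION, so the same post can come back twice. The de-duplication step the slide leaves to PHP, followed by the conversion to Exhibit JSON, could be sketched as below. The row format (one dict per result, keyed by variable name) and the item property names other than Exhibit's own "label", "type" and "uri" are assumptions.

```python
import json


def uniquify(rows):
    """Keep the first row seen for each ?post, preserving query order."""
    seen = set()
    unique = []
    for row in rows:
        if row["post"] not in seen:
            seen.add(row["post"])
            unique.append(row)
    return unique


def to_exhibit_json(rows) -> str:
    """Serialise de-duplicated rows as an Exhibit-style {"items": [...]} document."""
    items = [
        {
            "type": "MicroblogPost",
            "label": row["content"],
            "uri": row["post"],
            "date": row["date"],
            "maker": row["name"],
            "depiction": row["depiction"],
        }
        for row in uniquify(rows)
    ]
    return json.dumps({"items": items}, indent=2)
```

Keeping the first row per post preserves the ORDER BY DESC(?date) ordering, so the output still lists the latest updates first.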

Page 20: Microblogging: A Semantic Web and Distributed Approach

Faceted browsing

Page 21: Microblogging: A Semantic Web and Distributed Approach

Faceted browsing with geolocation

Page 22: Microblogging: A Semantic Web and Distributed Approach

Security, privacy, authentication

• We currently limit access to publishing, aggregation and content viewing by HTTP authentication and API keys
  – IP-based authentication using .htaccess
  – Global API key for a microblogging aggregator
• All updates are public on the client side
• TODO
  – Authentication schemes (OAuth, OpenID)
  – Private updates and private communities
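
The two checks listed above can be approximated in a few lines. This is a hedged sketch, not the SMOB implementation: in SMOB the IP restriction is handled by .htaccess rather than application code, and requiring both checks together is an assumption. All function names are hypothetical.

```python
import hmac
import ipaddress


def ip_allowed(client_ip: str, allowed_networks) -> bool:
    """True if client_ip falls inside any of the allowed CIDR networks."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(net) for net in allowed_networks)


def key_valid(presented_key: str, aggregator_key: str) -> bool:
    """Compare keys in constant time to avoid leaking matches via timing."""
    return hmac.compare_digest(presented_key.encode("utf-8"),
                               aggregator_key.encode("utf-8"))


def may_publish(client_ip: str, presented_key: str,
                allowed_networks, aggregator_key: str) -> bool:
    """Assumption: a publisher must pass both the IP and the API key check."""
    return ip_allowed(client_ip, allowed_networks) and key_valid(
        presented_key, aggregator_key)
```

A single global key means every publisher shares one secret, which is why per-user schemes such as OAuth or OpenID appear on the TODO list.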

Page 23: Microblogging: A Semantic Web and Distributed Approach

Future work

• More meta-data
  – Process hash tags before publishing RDF
    • Linked Data from the client-side
    • Tags / URIs relationships with MOAT
  – @replies, linked to FOAF URIs
• Other issues
  – Scalability, authentication, timezones
• Intelligent aggregators
  – Browse the SIOC-o-sphere to find relevant updates
  – Based on their content:
    • A music aggregator, retrieving only data linking to URIs of music bands
• Deployment within organisations
  – Corporate Microblogging in SIOC-based companies

Page 24: Microblogging: A Semantic Web and Distributed Approach

Thank you!

• Contacts
  – http://smob.sioc-project.org
  – #smob IRC channel on Freenode
  – sioc-dev on Google Groups
• SDoW2008
  – Social Data on the Web workshop @ ISWC2008