ekaw - linked data publishing
Linked Data Publishing
Ruben Taelman - @rubensworks
imec - Ghent University

Publishing as part of any Linked Data life cycle
[Figure: Linked Data life cycle with the steps Generate, Validate, Enhance, Publish and Query]

Linked Data Publishing
Linked Data Interfaces
Storage
Non-technical tasks

Linked Data provides machine-accessible data
Machines can retrieve and discover data through HTTP interfaces
Machines can understand the data

Ways of publishing Linked Data on the Web
Data dump: one RDF document
Linked Data document: one RDF document per topic
SPARQL endpoint: expressive query interface

Data dump
Simple for data publisher
Data dumps can be large (~gigabytes)
Querying only possible after downloading entire dataset

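The last point can be illustrated with a minimal sketch: before a client can answer even a simple question, it has to parse the whole dump. The inline `DUMP` string, the example.org URIs, and the naive line pattern are assumptions made for illustration; a real dump would be a downloaded file and would need a full N-Triples parser.

```python
import re

# Hypothetical stand-in for a downloaded dump; a real dump can be gigabytes.
DUMP = """\
<http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .
<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
<http://example.org/bob> <http://xmlns.com/foaf/0.1/name> "Bob" .
"""

# Naive line pattern: IRI subject, IRI predicate, IRI or plain literal object.
TRIPLE = re.compile(r'^<([^>]*)>\s+<([^>]*)>\s+(<[^>]*>|"[^"]*")\s*\.$')


def parse_dump(text):
    """Parse every line of the dump; nothing can be queried before this finishes."""
    triples = []
    for line in text.splitlines():
        match = TRIPLE.match(line.strip())
        if match:
            triples.append(match.groups())
    return triples


def names(triples):
    """Answer a simple question (all foaf:name values) by scanning all triples."""
    return sorted(o.strip('"') for _, p, o in triples
                  if p == "http://xmlns.com/foaf/0.1/name")


triples = parse_dump(DUMP)
print(names(triples))
```

All query effort sits on the client here, which is why the data dump is the cheapest option for the publisher.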
Linked Data document
Data is available in smaller fragments, according to subject
Linked Data principle of dereferencing
  “3. When someone looks up a URI, provide useful information, using the open Web standards such as RDF, SPARQL” (Hyland 2013)
Querying only possible by traversing links

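Dereferencing could look roughly as follows on the client side: an HTTP GET on the URI of a subject, with an `Accept` header asking for an RDF serialization. The example.org URI and the helper names `build_lookup` and `dereference` are hypothetical; only the standard-library `urllib` calls are real.

```python
from urllib.request import Request, urlopen


def build_lookup(uri, media_type="text/turtle"):
    """Build an HTTP GET request asking for an RDF representation of the URI."""
    return Request(uri, headers={"Accept": media_type})


def dereference(uri, media_type="text/turtle"):
    """Look up the URI and return (content type, body); performs a network call."""
    with urlopen(build_lookup(uri, media_type)) as response:
        return response.headers.get("Content-Type"), response.read().decode("utf-8")


# Only build the request here, so the sketch runs without network access.
request = build_lookup("http://example.org/resource/alice")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

To answer a query that spans several subjects, a client would repeat this lookup for the URIs it finds in each returned document, which is what "traversing links" amounts to.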
SPARQL endpoint
Requires higher computational effort from server
Single point to get and expose data
Easily queryable by clients

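As a rough sketch of why a SPARQL endpoint is easy for clients: under the SPARQL Protocol, a query can be sent as the `query` parameter of a single HTTP GET request, and the server does all the evaluation. The endpoint URL and the query below are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint URL

QUERY = """\
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name WHERE { ?person foaf:name ?name } LIMIT 10"""


def query_url(endpoint, query):
    """Encode the query as the 'query' parameter of a GET request URL."""
    return endpoint + "?" + urlencode({"query": query})


url = query_url(ENDPOINT, QUERY)
print(url)
```

The client only builds one URL; parsing, planning, and executing the query all happen on the server, which explains the higher computational effort listed above.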
How do the different interfaces relate to each other?
Linked Data Fragments (LDF)
A uniform view on Linked Data interfaces
[Figure: axis from high client effort to high server effort, showing data dump, Linked Data document and SPARQL result] (Verborgh 2016)

A big unexplored area on the LDF axis
[Figure: axis from high client effort to high server effort, showing data dump, Linked Data document and SPARQL result, with a "?" marking the unexplored area] (Verborgh 2016)

Triple Pattern Fragments (TPF), a trade-off between server and client effort
[Figure: axis from high client effort to high server effort, showing data dump, Linked Data document, SPARQL result and TPF] (Verborgh 2016)

Triple Pattern Fragments
Low-cost server interface
Fragmentation of a dataset by triple patterns
Client-side SPARQL query evaluation using a TPF interface

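The division of work can be sketched as follows: the server only answers single triple patterns, page by page, and the client combines those answers to evaluate a larger query. This is an in-memory illustration under stated assumptions, not the TPF specification or any existing server; the dataset, the prefixed names, and the functions `fragment`, `all_matches` and `names_of_known` are hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


# Hypothetical dataset held by the server.
DATASET = [
    Triple("ex:alice", "foaf:knows", "ex:bob"),
    Triple("ex:alice", "foaf:name", '"Alice"'),
    Triple("ex:bob", "foaf:name", '"Bob"'),
    Triple("ex:bob", "foaf:knows", "ex:carol"),
]


def fragment(dataset, subject=None, predicate=None, obj=None, page=1, page_size=2):
    """Server side: one page of triples matching the pattern; None is a variable."""
    matches = [t for t in dataset
               if subject in (None, t.subject)
               and predicate in (None, t.predicate)
               and obj in (None, t.object)]
    start = (page - 1) * page_size
    return {
        "triples": matches[start:start + page_size],
        "total": len(matches),  # count metadata a client can use for planning
        "has_next": start + page_size < len(matches),
    }


def all_matches(dataset, **pattern):
    """Client side: follow pages until the fragment is exhausted."""
    page, results = 1, []
    while True:
        frag = fragment(dataset, page=page, **pattern)
        results.extend(frag["triples"])
        if not frag["has_next"]:
            return results
        page += 1


def names_of_known(dataset, person):
    """Client-side join for: person foaf:knows ?x . ?x foaf:name ?name"""
    found = []
    for known in all_matches(dataset, subject=person, predicate="foaf:knows"):
        for named in all_matches(dataset, subject=known.object, predicate="foaf:name"):
            found.append(named.object)
    return found


first = fragment(DATASET, predicate="foaf:knows")
print(first["total"], names_of_known(DATASET, "ex:alice"))
```

The server function is a simple filter with paging, which keeps its cost low, while the join logic lives entirely in the client.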
Choosing a Linked Data interface is a trade-off between server and client effort

URI policies for interfaces
Linked Data uses URIs as a global identification system
URI design principles also apply to interface URIs:
  Persistent URIs and redirection
  Domain authority (e.g. government domain)
  Machine and human-readable representations through content negotiation
  ...

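Redirection and content negotiation can be combined as in the sketch below: the server reads the `Accept` header, picks a representation, and redirects from the persistent identifier to a matching document. The `/id/` to `/doc/` URI layout, the file extensions, the use of a 303 status code, and the function names are assumptions for illustration, not a policy stated in the slides; the `Accept` parsing is deliberately simplified.

```python
# Hypothetical representations offered by the server.
REPRESENTATIONS = {"text/turtle": ".ttl", "text/html": ".html"}


def preferred_type(accept_header, default="text/html"):
    """Pick the supported media type with the highest q-value (simplified parsing)."""
    best, best_q = default, 0.0
    for part in accept_header.split(","):
        fields = part.strip().split(";")
        media_type = fields[0].strip()
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if media_type in REPRESENTATIONS and q > best_q:
            best, best_q = media_type, q
    return best


def redirect(resource_uri, accept_header):
    """Return (status, Location) redirecting the identifier to a document URI."""
    media_type = preferred_type(accept_header)
    document = resource_uri.replace("/id/", "/doc/", 1) + REPRESENTATIONS[media_type]
    return 303, document


print(redirect("http://example.org/id/alice", "text/html;q=0.5, text/turtle"))
```

Because clients only ever store the identifier URI, the documents behind it can move or change format without breaking existing links.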
Interface and storage solution influence each other
Start with the most restrictive element
Storage is fixed → choose storage first, then interface
Machine limitations → choose interface first, then storage

Storage solutions for Linked Data interfaces
Data dump: RDF file, HDT, ...
Linked Data document: static or dynamic RDF files
Triple Pattern Fragments: RDF file, HDT, SPARQL engine, ...
SPARQL endpoint: SPARQL engine

Non-technical tasks
Licensing
Publication announcement
Maintenance

Linked Open Data requires an open license
All published data should have a connected license
Features of openness (Open Knowledge Foundation):
  Availability and access
  Reuse and redistribution
  Universal participation
Popular open license: CC0
Mention license in dataset listings and in metadata
Confidential data might require restrictive license and security

Announcing to the public
Communication channels: mailing lists, blogs, newsletters, …
Feedback channel: form or contact address for any issues
Centralized repositories (e.g. https://datahub.io)
Automated discovery with metadata (e.g. DCAT, VoID)

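Machine-readable metadata for automated discovery, including the license, might look like the output of this sketch, which generates a small DCAT description in Turtle. The dataset URI, title, and download URL are hypothetical, and `describe_dataset` is an illustrative helper that does no escaping, so it assumes a title without quotes; the DCAT and Dublin Core terms and the CC0 URI are the real ones.

```python
CC0 = "https://creativecommons.org/publicdomain/zero/1.0/"


def describe_dataset(dataset_uri, title, dump_url, license_uri=CC0):
    """Return a Turtle description of a dataset using DCAT and Dublin Core terms."""
    return f"""\
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .

<{dataset_uri}> a dcat:Dataset ;
    dct:title "{title}" ;
    dct:license <{license_uri}> ;
    dcat:distribution [
        a dcat:Distribution ;
        dcat:downloadURL <{dump_url}>
    ] .
"""


turtle = describe_dataset(
    "http://example.org/dataset/people",    # hypothetical dataset URI
    "People dataset",                       # hypothetical title
    "http://example.org/dumps/people.nt",   # hypothetical dump location
)
print(turtle)
```

Publishing such a description next to the data lets catalogs and crawlers find the dataset and its license without human intervention.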
Linked Data Publication is a continuous process
Social contract with data consumers
Avoid dataset / interface removal
Data can change
Movement of dataset to new location → URI persistence!
Responsible entity behind feedback channel

Conclusions
Different Linked Data interfaces exist for publishing Linked Data
Trade-off between server and client effort
Interface and storage solution influence each other
Properly license, announce and maintain your data

Sources
Verborgh, R. "Linked Data Publishing." http://rubenverborgh.github.io/WebFundamentals/linked-data-publishing/
Hyland, B., Atemezing, G., Villazón-Terrazas, B. "Best Practices for Publishing Linked Data." https://www.w3.org/TR/ld-bp/
Berners-Lee, T. "Linked Data." 2006. https://www.w3.org/DesignIssues/LinkedData.html
Villazón-Terrazas, B., et al. "Methodological guidelines for publishing government Linked Data." Linking Government Data. Springer New York, 2011. 27-49.