signposting for repositories
Post on 22-Jan-2018
400 Views
Preview:
TRANSCRIPT
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
Signposting for Repositories
Martin Klein@mart1nkle1n
http://orcid.org/0000-0003-0130-2097
Herbert Van de Sompel@hvdsomp
http://orcid.org/0000-0002-0715-6126
Research Library
Los Alamos National Laboratory
http://signposting.org
Signposting is funded by the Andrew W. Mellon Foundation
Cartoon by Patrick Hochstenbach
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
2
10.1594/PANGAEA.867908
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
3
10.1594/PANGAEA.867908
http://dx.doi.org/10.1594/PANGAEA.867908
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
4https://doi.pangaea.de/10.1594/PANGAEA.867908
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
5https://doi.pangaea.de/10.1594/PANGAEA.867908
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
6https://doi.pangaea.de/10.1594/PANGAEA.867908
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
7https://doi.pangaea.de/10.1594/PANGAEA.867908
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
8https://doi.pangaea.de/10.1594/PANGAEA.867908
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
9https://doi.pangaea.de/10.1594/PANGAEA.867908
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
10
Problems
• Humans can easily navigate such links i.e.,
• Copy the DOI and resolve it via dx.doi.org
• Determine where the bibliographic resources are
• Interpret the download link for the dataset ZIP
• Search for authors’ names on the web
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
11
Problems
• Humans can easily navigate such links i.e.,
• Copy the DOI and resolve it via dx.doi.org
• Determine where the bibliographic resources are
• Interpret the download link for the dataset ZIP
• Search for authors’ names on the web
• Machines can’t do any of this!
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
12
HTTP Links
Mark Nottingham (2010) RFC5988: Web Linking.
https://tools.ietf.org/rfc/rfc5988.txt
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
13
HTTP Links
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
14
HTTP Links
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
15
HTTP Links Are Used
curl –I http://dbpedia.org/data/Reykjavik
HTTP/1.1 200 OK
Date: Thu, 27 Oct 2016 04:43:28 GMT
Content-Type: application/rdf+xml; charset=UTF-8
Content-Length: 1210
Link:
<http://creativecommons.org/licenses/by-sa/3.0>
; rel=“license",
<http://dbpedia.org/data/Reykjavik>
; rel="alternate"; type="text/n3",
<http://dbpedia.org/resource/Reykjavik>; rel="describes",
<http://mementoarchive.lanl.gov/dbpedia/timegate/http://dbpedia.org/
data/Reykjavik>
; rel="timegate"
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
16
HTTP Link Relation Types
• Registered in IANA registry
• Strings, e.g. license, alternate, describes, timegate
• Requires a formal specification e.g., RFC
• Typically used for common relationships, generically specified
• Provides broad, coarse grained interoperability
• Minted by a community
• URIs e.g., http://xmlns.com/foaf/0.1/primaryTopic
• Requires community agreement
• Can be as specific as desired
• Can provide community-specific, fine grained interoperability
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
17
HTTP Links Are Pretty Neat
• Can uniformly be used for all MIME types
• Accessible via HTTP HEAD (no content transfer):
• Works for large resources and for restricted content
• HTTP Links can be conveyed:
• by-value, in the HTTP Link header
• by-reference, by using a linkset link in the HTTP header that
points to a collection of links (1)
• HTTP Links provide guidance to machine agents’ intent on
accomplishing a specific task
(1) Wilde, E. and Van de Sompel, H (2017)Linkset: A Link Relation Type and Media Types for Link Setshttps://datatracker.ietf.org/doc/draft-wilde-linkset-link-rel/
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
18
HTTP Links Alternative: Links in Resource
Representation
• Can only be done for media types that support inclusion of typed links i.e., using the HTML <link> element in <head>
• Requires HTTP GET (content transfer)
• HTML <link> is accessible to JavaScript
• For HTML pages, use both HTTP Link and HTML <link>
• Links can be conveyed:
• by-value, using HTML <link>
• by-reference, by using an HTML <link> with the linkset
relation type that points to a collection of links
• HTML <link> elements provide guidance to machine agents’ intent
on accomplishing a specific task
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
19
Signposting for Repositories
Proposal:
Use HTTP Links to address some long standing problems regarding
scholarly resources on the web, by interlinking them using
appropriate relation types.
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
20
bibliographicresources
constituentresources
HTTPPID
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
21
Pattern: Identifier
• Problem: It is not possible to determine the associated HTTP PID of
a scholarly object’s constituent resources
• Landing page URIs used for citation (1)
• Annotations do not refer to HTTP PID
• Solution: provide identifier link pointing at the HTTP PID
• Applies to: landing page, all constituent resources
(1) Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016)
Persistent URIs Must Be Used to Be Persistent.
In: WWW2016. http://arxiv.org/1602.09102
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
22
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
23
Use HTTP Link with identifier Relation Type
curl -I "https://doi.pangaea.de/10.1594/PANGAEA.867908"
HTTP/1.1 200 OK
Content-length: 8424
Content-type: text/html;charset=UTF-8
Link: <https://doi.org/10.1594/PANGAEA.867908>
; rel="identifier"
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
24
Pattern: Publication Boundary
• Problem: It is not possible to determine what the constituent
resources of a scholarly object are
• Preservation and text mining tools require portal-specific
heuristic to find those constituent resources (1)
• No direct path from an HTTP PID to e.g., the PDF
• Solution: provide item/collection links to interlink entry page
and constituent resources; convey MIME types on item links
• Applies to: All constituent resources of a scholarly object
(1) Van de Sompel, H., Rosenthal, D., and Nelson, M.L. (2016) Web Infrastructure to Support e-Journal Preservation (and More)
http://arxiv.org/abs/1605.06154
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
25
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
26
Use HTTP Link with item/collection Relation Type
curl -I "https://doi.pangaea.de/10.1594/PANGAEA.867908"
HTTP/1.1 200 OK
Content-length: 8424
Content-type: text/html;charset=UTF-8
Link: <https://doi.pangaea.de/10.1594/PANGAEA.867908?format=zip>
; rel="item"
; type="application/zip"
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
27
Pattern: Bibliographic Metadata
• Problem: It is not possible to determine where the bibliographic
resources that describes a scholarly object can be found
• Preservation and reference manager tools require portal-specific
heuristic to find those resources (1)
• Solution: provide describedby/describes links to interlink entry
page and bibliographic metadata resources
• Applies to:
• describedby: HTTP PID, landing page
• describes: bibliographic resources
(1) Van de Sompel, H., Rosenthal, D., and Nelson, M.L. (2016) Web Infrastructure to Support e-Journal Preservation (and More)
http://arxiv.org/abs/1605.06154
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
28
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
29
Use HTTP Link with describedby/describes Relation Type
curl -I "https://doi.pangaea.de/10.1594/PANGAEA.867908"
HTTP/1.1 200 OK
Content-length: 8424
Content-type: text/html;charset=UTF-8
Link:
<https://doi.pangaea.de/10.1594/PANGAEA.867908?format=citation_ris>
; rel="describedby"
; type="application/x-research-info-systems",
<https://doi.pangaea.de/10.1594/PANGAEA.867908?format=citation_
bibtex>
; rel="describedby"
; type="application/x-bibtex"
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
30
Pattern: Author
• Problem: It is not possible to uniquely determine who authored the
work
• Solution: provide author link to interlink HTTP PID and author-
identifying URI
• Applies to: HTTP PID, landing page, all constituent resources
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
31
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
32
Use HTTP Link with author Relation Type
curl -I "https://doi.pangaea.de/10.1594/PANGAEA.867908"
HTTP/1.1 200 OK
Content-length: 8424
Content-type: text/html;charset=UTF-8
Link: <http://orcid.org/0000-0003-1291-8524>
; rel="author"
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
33
Take-Aways
• HTTP Links and Relation Types help to:
• Convey the (persistent) identifier of a resource
• Inform about the boundaries of an object
• Point at bibliographic metadata
• Refer to an author-identifying resource
• Increase interoperability of repositories, embrace principles of
the web
• Make repositories more machine-friendly to the benefit of
humans
Signposting for Repositories
@mart1nkle1n, @hvdsomp
OR 2017, 06/29/2017, Brisbane, AUS
Signposting for Repositories
Martin Klein@mart1nkle1n
http://orcid.org/0000-0003-0130-2097
Herbert Van de Sompel@hvdsomp
http://orcid.org/0000-0002-0715-6126
Research Library
Los Alamos National Laboratory
http://signposting.org
Signposting is funded by the Andrew W. Mellon Foundation
Cartoon by Patrick Hochstenbach
top related