signposting for repositories

Post on 22-Jan-2018

400 Views

Category:

Internet

2 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

Signposting for Repositories

Martin Klein@mart1nkle1n

http://orcid.org/0000-0003-0130-2097

Herbert Van de Sompel@hvdsomp

http://orcid.org/0000-0002-0715-6126

Research Library

Los Alamos National Laboratory

http://signposting.org

Signposting is funded by the Andrew W. Mellon Foundation

Cartoon by Patrick Hochstenbach

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

2

10.1594/PANGAEA.867908

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

3

10.1594/PANGAEA.867908

http://dx.doi.org/10.1594/PANGAEA.867908

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

4https://doi.pangaea.de/10.1594/PANGAEA.867908

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

5https://doi.pangaea.de/10.1594/PANGAEA.867908

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

6https://doi.pangaea.de/10.1594/PANGAEA.867908

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

7https://doi.pangaea.de/10.1594/PANGAEA.867908

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

8https://doi.pangaea.de/10.1594/PANGAEA.867908

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

9https://doi.pangaea.de/10.1594/PANGAEA.867908

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

10

Problems

• Humans can easily navigate such links i.e.,

• Copy the DOI and resolve it via dx.doi.org

• Determine where the bibliographic resources are

• Interpret the download link for the dataset ZIP

• Search for authors’ names on the web

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

11

Problems

• Humans can easily navigate such links i.e.,

• Copy the DOI and resolve it via dx.doi.org

• Determine where the bibliographic resources are

• Interpret the download link for the dataset ZIP

• Search for authors’ names on the web

• Machines can’t do any of this!

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

12

HTTP Links

Mark Nottingham (2010) RFC5988: Web Linking.

https://tools.ietf.org/rfc/rfc5988.txt

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

13

HTTP Links

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

14

HTTP Links

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

15

HTTP Links Are Used

curl –I http://dbpedia.org/data/Reykjavik

HTTP/1.1 200 OK

Date: Thu, 27 Oct 2016 04:43:28 GMT

Content-Type: application/rdf+xml; charset=UTF-8

Content-Length: 1210

Link:

<http://creativecommons.org/licenses/by-sa/3.0>

; rel=“license",

<http://dbpedia.org/data/Reykjavik>

; rel="alternate"; type="text/n3",

<http://dbpedia.org/resource/Reykjavik>; rel="describes",

<http://mementoarchive.lanl.gov/dbpedia/timegate/http://dbpedia.org/

data/Reykjavik>

; rel="timegate"

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

16

HTTP Link Relation Types

• Registered in IANA registry

• Strings, e.g. license, alternate, describes, timegate

• Requires a formal specification e.g., RFC

• Typically used for common relationships, generically specified

• Provides broad, coarse grained interoperability

• Minted by a community

• URIs e.g., http://xmlns.com/foaf/0.1/primaryTopic

• Requires community agreement

• Can be as specific as desired

• Can provide community-specific, fine grained interoperability

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

17

HTTP Links Are Pretty Neat

• Can uniformly be used for all MIME types

• Accessible via HTTP HEAD (no content transfer):

• Works for large resources and for restricted content

• HTTP Links can be conveyed:

• by-value, in the HTTP Link header

• by-reference, by using a linkset link in the HTTP header that

points to a collection of links (1)

• HTTP Links provide guidance to machine agents’ intent on

accomplishing a specific task

(1) Wilde, E. and Van de Sompel, H (2017)Linkset: A Link Relation Type and Media Types for Link Setshttps://datatracker.ietf.org/doc/draft-wilde-linkset-link-rel/

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

18

HTTP Links Alternative: Links in Resource

Representation

• Can only be done for media types that support inclusion of typed links i.e., using the HTML <link> element in <head>

• Requires HTTP GET (content transfer)

• HTML <link> is accessible to JavaScript

• For HTML pages, use both HTTP Link and HTML <link>

• Links can be conveyed:

• by-value, using HTML <link>

• by-reference, by using an HTML <link> with the linkset

relation type that points to a collection of links

• HTML <link> elements provide guidance to machine agents’ intent

on accomplishing a specific task

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

19

Signposting for Repositories

Proposal:

Use HTTP Links to address some long standing problems regarding

scholarly resources on the web, by interlinking them using

appropriate relation types.

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

20

bibliographicresources

constituentresources

HTTPPID

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

21

Pattern: Identifier

• Problem: It is not possible to determine the associated HTTP PID of

a scholarly object’s constituent resources

• Landing page URIs used for citation (1)

• Annotations do not refer to HTTP PID

• Solution: provide identifier link pointing at the HTTP PID

• Applies to: landing page, all constituent resources

(1) Herbert Van de Sompel, Martin Klein, and Shawn Jones (2016)

Persistent URIs Must Be Used to Be Persistent.

In: WWW2016. http://arxiv.org/1602.09102

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

22

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

23

Use HTTP Link with identifier Relation Type

curl -I "https://doi.pangaea.de/10.1594/PANGAEA.867908"

HTTP/1.1 200 OK

Content-length: 8424

Content-type: text/html;charset=UTF-8

Link: <https://doi.org/10.1594/PANGAEA.867908>

; rel="identifier"

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

24

Pattern: Publication Boundary

• Problem: It is not possible to determine what the constituent

resources of a scholarly object are

• Preservation and text mining tools require portal-specific

heuristic to find those constituent resources (1)

• No direct path from an HTTP PID to e.g., the PDF

• Solution: provide item/collection links to interlink entry page

and constituent resources; convey MIME types on item links

• Applies to: All constituent resources of a scholarly object

(1) Van de Sompel, H., Rosenthal, D., and Nelson, M.L. (2016) Web Infrastructure to Support e-Journal Preservation (and More)

http://arxiv.org/abs/1605.06154

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

25

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

26

Use HTTP Link with item/collection Relation Type

curl -I "https://doi.pangaea.de/10.1594/PANGAEA.867908"

HTTP/1.1 200 OK

Content-length: 8424

Content-type: text/html;charset=UTF-8

Link: <https://doi.pangaea.de/10.1594/PANGAEA.867908?format=zip>

; rel="item"

; type="application/zip"

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

27

Pattern: Bibliographic Metadata

• Problem: It is not possible to determine where the bibliographic

resources that describes a scholarly object can be found

• Preservation and reference manager tools require portal-specific

heuristic to find those resources (1)

• Solution: provide describedby/describes links to interlink entry

page and bibliographic metadata resources

• Applies to:

• describedby: HTTP PID, landing page

• describes: bibliographic resources

(1) Van de Sompel, H., Rosenthal, D., and Nelson, M.L. (2016) Web Infrastructure to Support e-Journal Preservation (and More)

http://arxiv.org/abs/1605.06154

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

28

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

29

Use HTTP Link with describedby/describes Relation Type

curl -I "https://doi.pangaea.de/10.1594/PANGAEA.867908"

HTTP/1.1 200 OK

Content-length: 8424

Content-type: text/html;charset=UTF-8

Link:

<https://doi.pangaea.de/10.1594/PANGAEA.867908?format=citation_ris>

; rel="describedby"

; type="application/x-research-info-systems",

<https://doi.pangaea.de/10.1594/PANGAEA.867908?format=citation_

bibtex>

; rel="describedby"

; type="application/x-bibtex"

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

30

Pattern: Author

• Problem: It is not possible to uniquely determine who authored the

work

• Solution: provide author link to interlink HTTP PID and author-

identifying URI

• Applies to: HTTP PID, landing page, all constituent resources

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

31

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

32

Use HTTP Link with author Relation Type

curl -I "https://doi.pangaea.de/10.1594/PANGAEA.867908"

HTTP/1.1 200 OK

Content-length: 8424

Content-type: text/html;charset=UTF-8

Link: <http://orcid.org/0000-0003-1291-8524>

; rel="author"

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

33

Take-Aways

• HTTP Links and Relation Types help to:

• Convey the (persistent) identifier of a resource

• Inform about the boundaries of an object

• Point at bibliographic metadata

• Refer to an author-identifying resource

• Increase interoperability of repositories, embrace principles of

the web

• Make repositories more machine-friendly to the benefit of

humans

Signposting for Repositories

@mart1nkle1n, @hvdsomp

OR 2017, 06/29/2017, Brisbane, AUS

Signposting for Repositories

Martin Klein@mart1nkle1n

http://orcid.org/0000-0003-0130-2097

Herbert Van de Sompel@hvdsomp

http://orcid.org/0000-0002-0715-6126

Research Library

Los Alamos National Laboratory

http://signposting.org

Signposting is funded by the Andrew W. Mellon Foundation

Cartoon by Patrick Hochstenbach

top related