it19 20140721 linked data personal perspective
Post on 08-Jul-2015
Janifer Gatenby
OCLC EMEA
With acknowledgements to Richard Wallis and Anila Angjeli
LINKED DATA: A PERSONAL PERSPECTIVE
LINKED DATA
• What is it?
• What does it promise?
• How do we get there?
• What happens when we get there?
WHAT IS IT?
A WAY OF EXPRESSING A LINK
• Not really a new way of linking but a new way of expressing a link
It is about using canonical trusted globally
referenceable identifiers for concepts, people,
organisations, locations etc. instead of copying text
strings and losing the connection with the
authoritative sources they came from.
Richard Wallis
MARC21 LINKS
• 700 10 $a name $e role $0 authority control number
– (added entry in a MARC record for a name related to a work, not the main author)
These familiar links reference an authority record in the
same database as a bibliographic record, hence have
no address portion. Linked data extends the linking
range.
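The difference between a local control number and an addressable link can be sketched as follows. This is a minimal illustration, not a MARC parser: the field content is hypothetical (only the number nr89009099 comes from the URI examples in these slides), and the base address is assumed to be the id.loc.gov names namespace.

```python
import re
from typing import Dict, Optional


def parse_subfields(field: str) -> Dict[str, str]:
    """Split a field written in '$a value $e value' notation into subfields."""
    parts = re.findall(r"\$(\w)\s*([^$]*)", field)
    return {code: value.strip() for code, value in parts}


# Hypothetical 700 field; the name and role are placeholders.
field_700 = "$a Surname, Forename $e author $0 nr89009099"

# Assumed base address (taken from the id.loc.gov URI shown in the slides).
LC_NAMES_BASE = "http://id.loc.gov/authorities/names/"


def authority_uri(field: str, base: str) -> Optional[str]:
    """Add the missing address portion to the local control number in $0."""
    number = parse_subfields(field).get("0")
    return base + number if number else None


uri = authority_uri(field_700, LC_NAMES_BASE)
print(uri)
```

The $0 value alone only resolves inside the database that issued it; the base address is what makes the link usable from anywhere.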
EXTENDING THE LINKING RANGE: URI
• URI – immutable address as well as an identifier
– http://id.loc.gov/authorities/names/nr89009099
– http://viaf.org/viaf/116774723
– http://isni.org/isni/000000114556841
9 NACO libraries: LC, National Agricultural Library, National Library of Medicine, British Library, NL Mexico, NLNZ, NL Scotland, NL South Africa, NL Wales
EXTENDING THE LINKING RANGE: RDF
• RDF – metadata is expressed in triples
– Data
– Data label (properties)
– Vocabulary from which the label comes (gives context to the label)
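A triple can be sketched as a small data structure. The example below is illustrative only: it assumes the VIAF URI from the earlier slide and the name forms shown later in the deck describe the same person, and the helper split_predicate is a hypothetical name introduced here.

```python
from typing import NamedTuple, Tuple


class Triple(NamedTuple):
    subject: str    # the thing described, identified by a URI
    predicate: str  # the data label, a property URI taken from a vocabulary
    obj: str        # the data itself (a literal here; it could also be a URI)


FOAF = "http://xmlns.com/foaf/0.1/"        # vocabulary namespace
PERSON = "http://viaf.org/viaf/116774723"  # subject URI (pairing with the name is assumed)

triples = [
    Triple(PERSON, FOAF + "name", "De Groot, Gerard J., 1955-"),
    Triple(PERSON, FOAF + "name", "DeGroot, Gerard J., 1955-"),
]


def split_predicate(uri: str) -> Tuple[str, str]:
    """Separate a property URI into (vocabulary namespace, data label)."""
    cut = max(uri.rfind("/"), uri.rfind("#")) + 1
    return uri[:cut], uri[cut:]


vocabulary, label = split_predicate(triples[0].predicate)
print(vocabulary, label)
```

The vocabulary part of the predicate is what gives the label "name" its context.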
SPARQL
• A database can offer a SPARQL endpoint, i.e. it can receive queries over its RDF data
– Author [schema] Name [data label] De Groot, Gerard J., 1955 [data]
• “SPARQL allows users to write queries against data that can loosely be called "key-value" data,
more specifically it is data that follows the RDF specification of the W3C. The entire database is
thus a set of "subject-predicate-object" triples.”
• Stable release: SPARQL 1.1, 2013-03-21
– W3C Recommendation
http://en.wikipedia.org/wiki/SPARQL
http://www.w3.org/blog/SW/2008/01/15/sparql_is_a_recommendation/
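The idea of querying a set of subject-predicate-object triples can be sketched without a real SPARQL engine. The toy matcher below only mimics a single triple pattern, with None standing in for a SPARQL variable; the store contents reuse URIs from these slides, and their pairing is assumed for illustration.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
OWL_SAMEAS = "http://www.w3.org/2002/07/owl#sameAs"
VIAF = "http://viaf.org/viaf/116774723"

# A very small "database": a list of (subject, predicate, object) triples.
store = [
    (VIAF, FOAF_NAME, "De Groot, Gerard J., 1955-"),
    (VIAF, OWL_SAMEAS, "http://d-nb.info/gnd/12422900X"),
    (VIAF, OWL_SAMEAS, "http://www.idref.fr/034977651/id"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a variable."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Roughly the question this SPARQL text would put to a real endpoint:
QUERY = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?same WHERE { <http://viaf.org/viaf/116774723> owl:sameAs ?same }
"""

same = [t[2] for t in match(store, s=VIAF, p=OWL_SAMEAS)]
print(same)
```

A real endpoint would accept the QUERY text over HTTP and join several patterns; the matcher here handles one pattern only.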
LINKED DATA PRINCIPLES
1. Use URIs as names for things
2. Use HTTP URIs so people can look up those names
3. When someone looks up a URI, provide useful information, using the standards (RDF)
4. Include links to other URIs, so that they can discover more
Tim Berners-Lee - 2006
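Principles 2 and 3 amount to an ordinary HTTP lookup that asks for RDF. A minimal sketch with the Python standard library follows; the media type is a common choice for RDF/XML, and whether a given server honours the Accept header is an assumption. The request is only built here, not sent.

```python
import urllib.request


def build_lookup(uri: str, media_type: str = "application/rdf+xml") -> urllib.request.Request:
    """Prepare a GET request that asks the server for an RDF representation."""
    return urllib.request.Request(uri, headers={"Accept": media_type})


def look_up(uri: str, timeout: float = 10.0) -> bytes:
    """Dereference the URI (needs network access; not called in this sketch)."""
    with urllib.request.urlopen(build_lookup(uri), timeout=timeout) as response:
        return response.read()


request = build_lookup("http://viaf.org/viaf/116774723")
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

The same URI can serve a web page to a browser and RDF to a program, depending on what the client asks for.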
VOCABULARIES
• Vocabularies are not schemas; they are lists of defined data labels (concepts)
– Schema.org (Search engines)
– BibFrame (Library community)
– FOAF (Friend of a Friend)
– OWL (e.g. owl:sameAs)
• Vocabularies can be mixed
foaf:name "Jimmy Wales" ;
foaf:mbox <mailto:jwales@bomis.com> ;
foaf:homepage <http://www.jimmywales.com/> ;
foaf:nick "Jimbo" ;
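Mixing vocabularies works because each prefix expands to a different namespace, so labels from different lists cannot collide. The sketch below expands prefixed names to full URIs; the schema:url property is swapped in here purely to show mixing and is not part of the slide's example.

```python
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "owl": "http://www.w3.org/2002/07/owl#",
    "schema": "http://schema.org/",
}


def expand(prefixed_name: str) -> str:
    """Turn 'prefix:label' into the full property URI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in PREFIXES or not local:
        raise KeyError(f"unknown prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local


# One description using labels from two vocabularies.
description = {
    "foaf:name": "Jimmy Wales",
    "foaf:nick": "Jimbo",
    "schema:url": "http://www.jimmywales.com/",  # assumption: added to illustrate mixing
}

expanded = {expand(label): value for label, value in description.items()}
for uri, value in expanded.items():
    print(uri, "->", value)
```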
WHAT DOES IT PROMISE?
• Enriched displays without data maintenance
• Better harvesting and ranking
– because of markup
– and because of links
• Navigation to pages with additional information
– Example: from VIAF via ISNI to encyclopaedias, rights management societies (digitisation rights), Bowker (biographies from fly leaves)
INTERCONNECTING FRENCH CULTURAL HERITAGE TREASURES ON THE WEB
[Diagram. Sources: BnF Main catalogue (MARC), Digital documents (DC), BnF Archives and Manuscripts catalogue (EAD), other BnF resources, external resources. Semantic Web techniques: modeling, matching, clustering, alignments. Outputs: web pages for Internet users, raw data for machines.]
[Screenshot callouts: BnF persistent ID; content imported from Wikipedia and integrated in the page]
Information about the data model (or ontology) at: http://data.bnf.fr/about-en
Data can be downloaded
Existing ones + others defined for the specific needs of the project
BIG DATA AS RDF
• Data is re-usable without a full-blown conversion
• Permits 3rd party analysis of big data sets
• Data mining for new information
HOW DO WE GET THERE?
MAKING THE LINKS
DNB CultureGraph
– “It’s all about creating connections”
– DDC to RVK (German classification) by comparing search results
– GND (names) to German Wikipedia
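One way to picture "comparing search results" is to measure how much the result sets for two class notations overlap. The sketch below uses Jaccard overlap with a threshold; this is an assumed technique for illustration, not a description of how CultureGraph works, and the notations and record numbers are made up.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Share of records common to both result sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical record ids retrieved when searching by each notation.
ddc_results = {"ddc:A": {1, 2, 3, 4}, "ddc:B": {5, 6}}
rvk_results = {"rvk:X": {1, 2, 3}, "rvk:Y": {5, 6, 7, 8, 9}}


def propose_links(ddc: Dict[str, Set[int]], rvk: Dict[str, Set[int]],
                  threshold: float = 0.5) -> List[Tuple[str, str, float]]:
    """Suggest a link wherever two notations retrieve largely the same records."""
    links = []
    for d, d_ids in ddc.items():
        for r, r_ids in rvk.items():
            score = jaccard(d_ids, r_ids)
            if score >= threshold:
                links.append((d, r, score))
    return sorted(links, key=lambda link: -link[2])


links = propose_links(ddc_results, rvk_results)
print(links)
```

The threshold is a tuning choice: set too low, it links classes that merely share a few records.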
EXAMPLE: VIAF
• Ingesting data to compare and create links
• Makes clusters; cluster identifier
• Ingesting preferred to external linking
– Wikipedia, ISNI, WorldCat identities
– More data used for clustering, so more reliable
• VIAFBot for making reciprocal links in Wikipedia / Wikidata
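Ingesting the data is what makes comparison possible. The toy clustering below groups headings from different sources by a normalised match key; it is a deliberately simple stand-in, not VIAF's matching algorithm, and the source identifiers are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple


def match_key(heading: str) -> str:
    """Lower-case the heading and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", heading.lower())


# (record id, heading) pairs as they might arrive from three sources.
records: List[Tuple[str, str]] = [
    ("source-a:1", "De Groot, Gerard J., 1955-"),
    ("source-b:7", "DeGroot, Gerard J., 1955-"),
    ("source-c:3", "Another, Name, 1900-"),
]


def cluster(items: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group record ids whose headings share a match key."""
    clusters = defaultdict(list)
    for record_id, heading in items:
        clusters[match_key(heading)].append(record_id)
    return dict(clusters)


clusters = cluster(records)
for key, members in clusters.items():
    print(key, members)
```

With more ingested data per record (dates, titles, co-authors) the key can be made stricter, which is why ingesting tends to give more reliable clusters than linking on a name alone.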
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
<rdf:type rdf:resource="http://rdvocab.info/uri/schema/FRBRentitiesRDA/Person"/>
<foaf:name>De Groot, Gerard J., 1955-</foaf:name>
<foaf:name>DeGroot, Gerard J., 1955-</foaf:name>
<rdaGr2:dateOfBirth>1955-06-22</rdaGr2:dateOfBirth>
<owl:sameAs rdf:resource="http://data.bnf.fr/ark:/12148/cb12299846b#foaf:Person"/>
<owl:sameAs rdf:resource="http://www.idref.fr/034977651/id"/>
<owl:sameAs rdf:resource="http://d-nb.info/gnd/12422900X"/>
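To show how a program would pick the links out of such a record, here is a minimal sketch using the standard XML parser. The fragment above has no enclosing elements, so the rdf:RDF and rdf:Description wrapper and the rdf:about value are assumptions added to make it parseable.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Wrapper elements and rdf:about are assumed; the inner lines follow the slide.
DOCUMENT = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <rdf:Description rdf:about="http://viaf.org/viaf/116774723">
    <foaf:name>De Groot, Gerard J., 1955-</foaf:name>
    <foaf:name>DeGroot, Gerard J., 1955-</foaf:name>
    <owl:sameAs rdf:resource="http://www.idref.fr/034977651/id"/>
    <owl:sameAs rdf:resource="http://d-nb.info/gnd/12422900X"/>
  </rdf:Description>
</rdf:RDF>"""

root = ET.fromstring(DOCUMENT)

# ElementTree addresses namespaced names as "{namespace}local".
same_as = [el.get(f"{{{RDF}}}resource") for el in root.iter(f"{{{OWL}}}sameAs")]
names = [el.text for el in root.iter(f"{{{FOAF}}}name")]

print(names)
print(same_as)
```

A dedicated RDF library would be the usual choice for real data; plain XML parsing is used here only to keep the sketch free of dependencies.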
[Diagram: cross-domain, bridging domains. Libraries; Text Rights; Music Rights; Trade Sources; Encyclopaedias; Researchers & Professional; Granting organisations; Professional Societies; Article databases; Theses databases; Archives and Museums]
EXAMPLE: ISNI: 15 MILLION LINKS
Linked Data: isni.org/isni/
LA TROBE UNIVERSITY LINKS: 3,427
LA TROBE UNIVERSITY: 1,864 VIAF LINKS
ISNI – A LINKING IDENTIFIER
• Identifiers seal uniqueness: “n” number of other elements are necessary for uniqueness
• Stable identifier; stable metadata:
– assigned where there is confidence in the quality and completeness of the metadata to establish uniqueness
• ISNI system + Quality Team (BL & BnF)
Linking erroneous data propagates errors.
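A cheap guard against propagating one kind of error is to verify an identifier's check character before linking to it. An ISNI has 16 characters, the last being an ISO 7064 MOD 11-2 check character. The sketch below implements that check; the function names are introduced here for illustration, and passing the check says nothing about whether the metadata behind the identifier is right.

```python
def isni_check_char(first15: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in first15:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_isni(isni: str) -> bool:
    """True if the value has 16 characters and a matching check character."""
    compact = isni.replace(" ", "").upper()
    body = compact[:15]
    if len(compact) != 16 or not (body.isascii() and body.isdigit()):
        return False
    return isni_check_char(body) == compact[15]


print(is_valid_isni("000000114556841"))      # 15 characters, as printed in the slides' URI
print(is_valid_isni("0000 0001 1455 6841"))  # same digits with one more leading zero
```

The 15-character form fails on length alone, while the padded form passes the check; that suggests a zero was dropped somewhere in transcription, though this is an inference rather than something the slides state.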
LINKS ARE MADE ONCE – THEN INHERITED
• URI – immutable address as well as an identifier
– http://id.loc.gov/authorities/names/nr89009099
– http://viaf.org/viaf/116774723
– http://isni-url.oclc.nl/isni/000000114556841
9 NACO libraries: Library of Congress, National Agricultural Library, National Library of Medicine, British Library, NL Mexico, NLNZ, NL Scotland, NL South Africa, NL Wales
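"Made once, then inherited" can be sketched as following equivalence links transitively: a catalogue that links to one identifier picks up every other identifier already connected to it. The example assumes the three URIs on this slide identify the same entity and that the two links listed exist; both are assumptions for illustration.

```python
from collections import deque
from typing import Dict, Iterable, Set, Tuple

LC = "http://id.loc.gov/authorities/names/nr89009099"
VIAF = "http://viaf.org/viaf/116774723"
ISNI = "http://isni-url.oclc.nl/isni/000000114556841"

# Links made once, by whoever did the matching (assumed for this sketch).
links = [(LC, VIAF), (VIAF, ISNI)]


def equivalents(start: str, pairs: Iterable[Tuple[str, str]]) -> Set[str]:
    """Collect every URI reachable from start through equivalence links."""
    neighbours: Dict[str, Set[str]] = {}
    for a, b in pairs:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


inherited = equivalents(LC, links)
print(sorted(inherited))
```

A library holding only the LC link reaches the ISNI URI without ever having matched against ISNI itself.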
WHAT HAPPENS WHEN WE GET THERE?
HOW DOES SEARCHING WORK?
• Search happens mostly in the search engines
• Library catalogue concentrates on:
– Being linked to
– Linking out (navigation)
– Delivery, particularly of the digitised and immediate
HOW DO SEARCH AND LINKED DATA INTERACT?
• Is search really fully delegated to search
engines & larger union catalogues?
SEARCH TYPES
Search type | Happening in
Known item | Search engines, also in more specific sources where noise is a problem
Subject search | Search engines, also in more specific sources, to reduce noise and benefit from more precise searching capabilities
Index browse | In catalogues
Follow a link | Everywhere. In library catalogues from a full record display.

The more your catalogue is linked in, the more likely it is to attract all types of searches.
STORE ONLY THE LINKS?
• Data needed
– For making indexes
– For comparisons, e.g. for de-duplication
– For data mining
It is about using canonical trusted globally
referenceable identifiers for concepts, people,
organisations, locations etc. instead of copying text
strings and losing the connection with the
authoritative sources they came from.
This doesn’t mean that you only need the links; you often also need to ingest the data.
Besides, data storage is no longer the constraint it once was.
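Why a bare link is not enough for indexing can be sketched with a tiny inverted index. A URI on its own contributes no searchable words; the ingested name forms do. The data below reuses the VIAF URI and the two name forms from earlier slides, and their pairing is assumed for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set

# Labels ingested alongside the link (pairing assumed for this sketch).
ingested: Dict[str, List[str]] = {
    "http://viaf.org/viaf/116774723": [
        "De Groot, Gerard J., 1955-",
        "DeGroot, Gerard J., 1955-",
    ],
}


def build_index(labels_by_uri: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each lower-cased word to the URIs whose labels contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for uri, labels in labels_by_uri.items():
        for label in labels:
            for token in re.findall(r"[a-z0-9]+", label.lower()):
                index[token].add(uri)
    return index


index = build_index(ingested)
links_only_index = build_index({"http://viaf.org/viaf/116774723": []})

print(sorted(index))
print(len(links_only_index))
```

Both spellings, "groot" and "degroot", lead to the same URI, while the index built from the link alone is empty.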
READ FURTHER
• http://www.slideshare.net/tulipbiru64/the-single-power-of-link-richard-wallis
• http://www.slideshare.net/rjw/linked-data-and-oclc