Follow the Fox to Renardus: an Academic Subject Gateway Service for Europe
Cross-browsing and Cross-searching in a Distributed Network of Subject Gateways: Architecture, Data Model and Classification
Dr. Heike Neuroth & Traugott Koch
State Library of Lower Saxony and the University Library of Göttingen, Germany, neuroth@mail.sub.uni-goettingen.de
NetLab, Lund University Library Development Department, Sweden, Traugott.Koch@ub2.lu.se
ELAG 2001, Prague, 6-8 June 2001, Neuroth & Koch
Content
  Renardus (aim, partners, etc.)
  Subject Gateway (definition, elements)
  Renardus Application Profile (working steps, metadata core set, data model, etc.)
  Renardus Collection Level Description
  Renardus Technical Approach
  DDC Mapping for Cross-Browsing (methods, mapping relationships, etc.)
  Outlook
What is Renardus?
  EU-funded project: EC funding 1.7 million EUR; including non-funded costs 2.3 million EUR
  1 January 2000 - 30 June 2002
  under “Information Society Technologies” (IST-1999-10562), “Promoting a User-friendly Information Society”, a major theme of the European Union's 5th Framework Programme
  partners drawn from 7 countries:
    project management: National Library Den Haag (NL)
    Denmark, Finland, Sweden, France, United Kingdom, Germany
Objectives
  to provide access to distributed quality-controlled subject gateways (high-quality metadata collections) across Europe via one single interface:
    cross-search
    cross-browse
  and to develop and define:
    metadata solutions: Renardus Application Profile, Renardus Namespaces, Renardus Collection Level Description
    technical solutions
    organizational/business models
Member Subject Gateways
  DAINet: German Agricultural Information System
  Document Server DEPOSIT: Deposit of German Online Dissertations
  DutchESS: Dutch Electronic Subject Service
  EELS: Engineering Electronic Library, Sweden
  FVL: The Finnish Virtual Library
  NOVAGate: Libraries of Nordic Agricultural & Veterinary Univ.
  SSG-FI: MathGuide, Geo-Guide, History Guide, Anglistik Guide
  RDN hubs: Resource Discovery Network (EEVL, SOSIG, OMNI, ...)
  Danish Electronic Research Library (future partner)
  Les Signets: Collection of Internet Resources (future partner)
Subject Gateway
“Quality-controlled subject gateways are Internet-services which
apply a rich set of quality measures to support systematic resource
discovery. Considerable manual effort is used to secure a selection
of resources which meet quality criteria and to display a rich
description of these resources with standards-based metadata.
Regular checking and updating ensure good collection management.
A main goal is to provide a high quality of subject access through
indexing resources using controlled vocabularies and by offering a
deep classification structure for advanced searching and browsing.”
Subject Gateway cont.
  Elements:
    creation: manual/intellectual, experts, etc.
    selection and collection development: policy, selection criteria, etc.
    collection management: maintenance of collection, etc.
    resource description/metadata: rich set of metadata, formalized content description, etc.
    subject classification/subject access: controlled vocabularies, etc.
    standards: allow interoperability, etc.
    value-adding features: display, usage features, etc.
Working Steps - General
  selection of necessary/meaningful elements:
    for a service like Renardus: “Meta-Subject Gateway”, European service (multilingual access, search, browse)
    for search, filter, sort, and display options
    for browse, subject access
  selection of common metadata format (exchange format):
    Dublin Core Metadata Element Set v1.1
    Dublin Core Qualifiers
    others
    home-grown
Working Steps - Analysis
  first survey of partners' metadata formats and detailed description of each subject gateway
  GENERAL
    name of SG, acronym, responsible organization, source of funding, time for record creation, general description, etc.
  COLLECTION/SELECTION
    target user group, common primary language of target audience, collection scope, geographical and language coverage, selection criteria, granularity, resource types, resource formats, etc.
  CONTENT - METADATA
    metadata scheme, metadata set, crosswalks, interoperability, cataloging rules, authority files, etc.
Working Steps - Analysis cont.
  CONTENT - OTHERS
    metadata browsable, searchable
    language(s) of descriptions, thesauri, interface, translation support, etc.
    keywords, classification systems, etc.
  INDEX TYPE/TECHNICAL NOTES
    search engine, indexing system, structure of data storage, etc.
  INTELLECTUAL PROPERTY RIGHTS (IPR)
    copyright, branding
  VARIOUS
    (quality) control, link checking, record checking/update, etc.
    backlinks of the gateway, statistical analysis of log files, etc.
First Results
  definition of 8 metadata elements without detailed semantics, syntax based on Dublin Core:
    DC.Title
    DC.Creator
    DC.Description
    DC.Subject
    DC.Identifier
    DC.Language
    DC.Type
    Country
Renardus Data Model
  detailed investigations of each element about:
    semantics and syntax of each element
    qualifiers (refinements, encoding schemes)
    cataloging rules (creator, description, keywords)
    namespace
    repeatability of each element
    form of obligation (mandatory, strongly recommended, optional)
    language qualifier (for title, description, subject)
  and:
    administrative elements
    future elements (rights, publisher), additional elements (format, etc.)
    common browsing structure via classification system (home-grown, reuse of an existing system, which one)
Renardus Application Profile
  Renardus Application Profile based on four namespaces, to be encoded in RDF/XML:
    Dublin Core Namespace: [DCMES version 1.1] Dublin Core Metadata Element Set, Version 1.1: Reference Description
    Dublin Core Qualifiers Namespace: [DCMES Qualifiers (2000-07-11)] Dublin Core Qualifiers
    Renardus Namespace: [RMES version 0.1, 2001-04-30] Renardus Metadata Element Set
    Renardus Namespace Qualifiers: [RMES Qualifiers version 0.1, 2001-04-30] Renardus Metadata Element Set
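To make the idea of one record drawing on several namespaces in RDF/XML more concrete, here is a minimal Python sketch using only the standard library. The RDF and Dublin Core namespace URIs are the published ones; the Renardus namespace URI, the `country` element name under it, and the sample values are hypothetical placeholders, since the slides do not state them. It is an illustration, not the project's actual encoding.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"
# Hypothetical placeholder: the slides do not give the Renardus namespace URI.
RMES = "http://example.org/renardus/rmes/0.1/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Ask ElementTree to use readable prefixes when serializing.
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)
ET.register_namespace("rmes", RMES)


def record_to_rdfxml(url, title, lang, subjects, country):
    """Serialize one resource description as an RDF/XML string."""
    root = ET.Element(f"{{{RDF}}}RDF")
    desc = ET.SubElement(root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": url})
    # Title carries a language tag, as the profile requires for titles.
    title_el = ET.SubElement(desc, f"{{{DC}}}title", {XML_LANG: lang})
    title_el.text = title
    # Subject is repeatable, so emit one element per value.
    for subject in subjects:
        ET.SubElement(desc, f"{{{DC}}}subject").text = subject
    # Country is not a Dublin Core element; it lives in the (assumed) Renardus namespace.
    ET.SubElement(desc, f"{{{RMES}}}country").text = country
    return ET.tostring(root, encoding="unicode")


xml_text = record_to_rdfxml(
    "http://example.org/resource/1", "Sample Resource", "en",
    ["mathematics", "geometry"], "DE",
)
```

Parsing `xml_text` back with `ET.fromstring` yields the same elements, which is a quick way to check that the namespaces round-trip.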
Renardus AP cont.
  “content metadata”
  Title and Title.Alternative
    title: DCMES: mandatory, not repeatable, language tag
    title.alternative: DCMES Qualifiers: optional, repeatable, language tag
  Creator
    DCMES: strongly recommended, repeatable
    RMES Qualifiers (LastName, FirstName): strongly recommended, repeatable
  Description
    DCMES: mandatory in text version, repeatable, language tag
Renardus AP cont.
  Subject
    DCMES: mandatory, repeatable, language tag
    DCMES Qualifiers: strongly recommended, repeatable, language tag
    RMES Qualifiers (all other encoding schemes): mandatory, repeatable, language tag
    RMES Qualifiers (Ren-DDC): mandatory, repeatable
  Identifier
    DCMES Qualifiers: mandatory, repeatable (probably in the pilot system)
    RMES Qualifiers: “Operational System” with qualifiers “Archive”, “Mirror” ...
  Language
    DCMES Qualifiers: strongly recommended, repeatable
Renardus AP cont.
  Type
    DCMES: strongly recommended, repeatable
    DCMES Qualifiers (DCT1): strongly recommended, repeatable
    DCMES Qualifiers (DCT2): “Operational System”
  Country
    RMES Qualifiers: strongly recommended, not repeatable
  “administrative metadata”
  Full Record URL
    RMES Qualifiers: strongly recommended, not repeatable
  SBIG ID
    RMES Qualifiers: mandatory, not repeatable
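The obligation and repeatability rules listed on the Application Profile slides lend themselves to a mechanical check. The Python sketch below is one possible reading, not the project's validator: it collapses the per-namespace variants of each element into a single rule, and the key names (`full_record_url`, `sbig_id`, and so on) and the record layout (element name mapped to a list of values) are assumptions made for illustration.

```python
# (obligation, repeatable) per element, simplified from the slides.
PROFILE = {
    "title": ("mandatory", False),
    "title.alternative": ("optional", True),
    "creator": ("strongly recommended", True),
    "description": ("mandatory", True),
    "subject": ("mandatory", True),
    "identifier": ("mandatory", True),
    "language": ("strongly recommended", True),
    "type": ("strongly recommended", True),
    "country": ("strongly recommended", False),
    "full_record_url": ("strongly recommended", False),
    "sbig_id": ("mandatory", False),
}


def validate(record):
    """Return (errors, warnings) for a record given as {element: [values]}."""
    errors, warnings = [], []
    for element, (obligation, repeatable) in PROFILE.items():
        values = record.get(element, [])
        if not values:
            if obligation == "mandatory":
                errors.append(f"missing mandatory element: {element}")
            elif obligation == "strongly recommended":
                warnings.append(f"missing strongly recommended element: {element}")
        elif len(values) > 1 and not repeatable:
            errors.append(f"element not repeatable: {element}")
    # Elements outside the profile are reported rather than silently dropped.
    for element in record:
        if element not in PROFILE:
            errors.append(f"unknown element: {element}")
    return errors, warnings


good = {
    "title": ["Sample Resource"],
    "description": ["A short description."],
    "subject": ["mathematics"],
    "identifier": ["http://example.org/resource/1"],
    "sbig_id": ["gateway-a"],
}
good_errors, good_warnings = validate(good)
```

Treating a missing "strongly recommended" element as a warning rather than an error is a design choice of this sketch; the slides only name the three obligation levels.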
Renardus CLD Schema
  Collection Level Description: simple description of collections, locations and related people or organizations
  in Renardus: to provide information about participating Subject Gateways:
    users choose Subject Gateways for thematic search (semi-automatic selection for subject)
    well-structured background information (human and machine readable)
    promotion
    registry of Subject Gateways
Renardus CLD Format
  Format: based on RSLP Collection Description (UKOLN):
    Dublin Core metadata elements (e.g. DC.Title, DC.Description, DC.Subject)
    RSLP metadata elements (cld.country)
    Renardus specific metadata elements (e.g. rencld:acronym, rencld:subjectNotation, rencld:resourceLanguage etc.)
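One way to picture how collection level descriptions could support the semi-automatic selection of gateways by subject is sketched below in Python. The field names mirror the elements named on the slide, but the class, the function, the gateway acronyms, and the convention of storing DDC notations as prefixes with trailing zeros removed (for example "51" for class 510) are all assumptions made for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionDescription:
    title: str                 # DC.Title
    description: str           # DC.Description
    acronym: str               # rencld:acronym
    country: str               # cld.country
    # rencld:subjectNotation, stored here as DDC prefixes (assumed convention).
    subject_notations: list = field(default_factory=list)
    # rencld:resourceLanguage
    resource_languages: list = field(default_factory=list)


def gateways_for_class(descriptions, ddc_class):
    """Return acronyms of gateways whose declared notations cover the DDC class."""
    return [
        d.acronym
        for d in descriptions
        if any(ddc_class.startswith(n) for n in d.subject_notations)
    ]


# Invented sample registry; not the real Renardus gateway descriptions.
REGISTRY = [
    CollectionDescription("Mathematics gateway", "Maths resources", "GW-MATH", "DE", ["51"], ["en", "de"]),
    CollectionDescription("Agriculture gateway", "Agricultural resources", "GW-AGRI", "SE", ["63"], ["sv"]),
]
geometry_gateways = gateways_for_class(REGISTRY, "516")
```

A broker holding such a registry could offer only the matching gateways when a user starts a thematic search.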
Renardus CLD Tool
  WWW-based form
  RDF, RDF/XML, and text encoding
  file is saved locally; each partner is able to update its description at any time
  Renardus broker gathers all Subject Gateway descriptions
Renardus Technical Approach
  PREPARATION
    investigation:
      of available standards and technologies
      of functional and user requirements
      of service provider requirements
    formulation of use cases in UML
    development of data model
      data model
    choosing architecture (decentralized vs. centralized)
      architectural diagram
    search/retrieval protocol
    common profile (map data model to the protocol Z39.50)
      Z39.50 profile, Bath compliant
Renardus Technical Approach cont.
  IMPLEMENTATION
    data normalization
      encoding RDF/XML (RDF normalizing toolkit)
      classification mapping (mapping tool adapted from CARMENx)
      CLDs (CLD tool adapted from RSLP)
    creation of participants' Renardus servers (Z39.50, Z'mbol)
    implementation of broker software and functionality
      cross-searching (Zebril and modified EUROPAGATE simultaneous gateway)
      cross-browsing (browsing tool, SQL)
    user interface implementation (with use cases)
      screen layout (Zebril and HTML, JavaScript)
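The broker's cross-searching role, sending one query to several gateway servers at the same time and merging what comes back, can be sketched in a few lines of Python. This is only an illustration of the fan-out pattern: no Z39.50 library is involved, each "gateway" is an in-memory stand-in, and the function names and sample titles are invented.

```python
from concurrent.futures import ThreadPoolExecutor


def make_gateway(name, records):
    """Build a stand-in search function for one gateway (no network involved)."""
    def search(term):
        needle = term.lower()
        # Tag each hit with the gateway it came from.
        return [dict(r, gateway=name) for r in records if needle in r["title"].lower()]
    return search


def cross_search(gateways, term):
    """Send the same query to every gateway concurrently and merge hits by title."""
    if not gateways:
        return []
    with ThreadPoolExecutor(max_workers=len(gateways)) as pool:
        result_sets = list(pool.map(lambda search: search(term), gateways))
    merged = [hit for hits in result_sets for hit in hits]
    return sorted(merged, key=lambda hit: hit["title"].lower())


GATEWAYS = [
    make_gateway("gateway-a", [{"title": "Soil Science Portal"}, {"title": "Algebra Notes"}]),
    make_gateway("gateway-b", [{"title": "Applied Soil Ecology"}]),
]
hits = cross_search(GATEWAYS, "soil")
```

In a real broker each stand-in would be replaced by a client for the partner's server, and timeouts would be needed so that one slow gateway does not block the merged result.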
DDC Mapping for Cross-Browsing
  why subject cross-browsing and classification?
  why switching language?
    browsing/mapping from DDC to the local systems/browsing structures
  why DDC?
    comparison to alternatives
    research license, allowed changes
  analysis of partners' classification systems
    types, adaptations, number of levels and classes, subject overlap
DDC Mapping for Cross-Browsing cont.
  mapping approaches and issues
  mapping methods
    mapping between classes, not between individual resources
    priorities: e.g. only well-used classes are mapped
    recommendations for local improvements
  mapping relationships
    fully equivalent, narrower and broader equivalent, major and minor overlap
    reuse for retrieval result clustering
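The mapping described above links DDC classes to classes in the local systems, each link labelled with one of five relationships. A small Python sketch of such a mapping table follows. The five relationship names come from the slide; the table layout, the function, and every row of sample data are invented for illustration and say nothing about the actual Renardus mappings.

```python
from collections import defaultdict
from enum import Enum


class Relation(Enum):
    FULLY_EQUIVALENT = "fully equivalent"
    NARROWER_EQUIVALENT = "narrower equivalent"
    BROADER_EQUIVALENT = "broader equivalent"
    MAJOR_OVERLAP = "major overlap"
    MINOR_OVERLAP = "minor overlap"


# (DDC class, gateway, local class, relation); all rows are invented examples.
MAPPINGS = [
    ("510", "gateway-a", "Mathematics", Relation.FULLY_EQUIVALENT),
    ("510", "gateway-b", "Algebra", Relation.NARROWER_EQUIVALENT),
    ("510", "gateway-c", "Science and mathematics", Relation.BROADER_EQUIVALENT),
    ("630", "gateway-b", "Farming", Relation.MAJOR_OVERLAP),
]


def local_classes_for(ddc_class):
    """Group the local classes mapped to one DDC class by their relationship."""
    grouped = defaultdict(list)
    for ddc, gateway, local_class, relation in MAPPINGS:
        if ddc == ddc_class:
            grouped[relation].append((gateway, local_class))
    return dict(grouped)


maths_targets = local_classes_for("510")
```

Because the mapping is between classes rather than individual resources, the table stays small relative to the collections it connects.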
DDC Mapping for Cross-Browsing cont.
  technical solution
    sources: local classifications, CORC Web Dewey
    mapping tool adapted from CARMENx (MySQL, PHP, JavaScript)
    syntax of the mapping information
    creation of the browsing pages
  usage of the DDC mapping in Renardus
    “browse and jump”
    why not virtual browsing?
    DDC classification search (in advanced search)
    user interface solutions
DDC Mapping for Cross-Browsing cont.
  future
    recommendations for subject access efforts in gateways and brokers
    multilingual access to the DDC top-levels
    automatic mapping (and classification) as support
    owners should take over for sustainable mapping
  documentation
    DDC mapping report (D7.4)
    practical mapping guidelines (D7.4)
    paper at IFLA Satellite Conf., August 2001
Outlook
  June 2001: Public Deliverable WP 6, D6.5
    Renardus Application Profile
    Renardus Namespaces
    Renardus Collection Level Description
    DDC Mapping
  June 2001: Beta-Version of Renardus broker
    first DDC mapping results
    first evaluations of broker will start
  November 2001
    Renardus Workshop for future participating Subject Gateways
URLs & References
  Renardus - http://www.renardus.org
  SUB Renardus - http://renardus.sub.uni-goettingen.de/ (also with D7.4)
  News Digest SIGN-UP Form - http://www.renardus.org/news/sign-up.html
  Evaluation of existing data models (D6.1) - http://www.renardus.org/deliverables/d6_1/docframe.htm
  DCMI Dublin Core Metadata Initiative - http://www.dublincore.org/
  Dublin Core Metadata Element Set, Version 1.1: Reference Description - http://www.dublincore.org/documents/dces/
  Dublin Core Qualifiers - http://www.dublincore.org/documents/dcmes-qualifiers/
  DCMI Agents Working Group - http://www.dublincore.org/groups/agents/
  DCMI Type Working Group - http://www.dublincore.org/groups/type/
  RSLP Collection Description - http://www.ukoln.ac.uk/metadata/rslp/
  CLD Collection Level Description - http://ukoln.ac.uk/metadata/cld/
  RSLP Collection Description Tool - http://www.ukoln.ac.uk/metadata/rslp/tool/
  Subject Gateways (Traugott Koch): Online Information Review, Vol. 24, Number 1, 2000
Cross-Search
  basic index: Title, Description, Subject
  field search:
    Title
    Creator (in DC Simple and later on in RMES Qualifiers)
    Description
    DDC Captions (also cross-browsable!)
    Subject (in future: several encoding schemes for keyword and classification systems of partners)
    Type
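The difference between the basic index and a field search can be shown with a short Python sketch. The set of fields follows the slide; the record layout (field name mapped to a list of strings), the key `ddc_caption`, simple substring matching, and the sample records are assumptions for illustration only.

```python
# A query without a field is run against the basic index only.
BASIC_INDEX = ("title", "description", "subject")
SEARCHABLE_FIELDS = ("title", "creator", "description", "ddc_caption", "subject", "type")


def matches(record, term, field=None):
    """True if the term occurs in the basic index, or in the one field given."""
    if field is None:
        fields = BASIC_INDEX
    elif field in SEARCHABLE_FIELDS:
        fields = (field,)
    else:
        raise ValueError(f"not a searchable field: {field}")
    needle = term.lower()
    return any(needle in value.lower() for f in fields for value in record.get(f, []))


def search(records, term, field=None):
    return [r for r in records if matches(r, term, field)]


# Invented sample records.
RECORDS = [
    {"title": ["Geometry Resources"], "creator": ["Example, Ann"],
     "description": ["Links on Euclidean geometry"], "subject": ["mathematics"],
     "ddc_caption": ["Geometry"], "type": ["Text"]},
    {"title": ["Crop Atlas"], "creator": ["Geometry, Bob"],
     "description": ["Maps of crops"], "subject": ["agriculture"],
     "ddc_caption": ["Agriculture"], "type": ["Image"]},
]
basic_hits = search(RECORDS, "geometry")
creator_hits = search(RECORDS, "geometry", field="creator")
```

Here the unqualified query finds only the first record, because Creator is not part of the basic index, while the Creator field search finds only the second.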
Filter Options
  Type
    DCMI Type 1 (mapping of partners' document types to Dublin Core Type 1)
    in future also meaningful: mapping to Sub Type List of DCMI?
    probably no Renardus-specific type list
  Language (of resources and languages of metadata = Language Tag)
  Country
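Filtering by type presupposes that each partner's document types have been mapped onto a shared list. The Python sketch below shows one way this could look. The target names are, to my understanding, the top-level terms of the DCMI Type Vocabulary as published in 2000; the local type names, the mapping rows, and the record layout are invented, since the slides do not list the partners' types.

```python
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "Service", "Software", "Sound", "Text",
}

# Hypothetical local document types and their assumed DCMI Type equivalents.
LOCAL_TO_DCMI = {
    "journal article": "Text",
    "preprint": "Text",
    "map": "Image",
    "database": "Dataset",
}


def normalize_type(local_type):
    """Map a partner's document type to a DCMI Type, or None if unmapped."""
    return LOCAL_TO_DCMI.get(local_type.strip().lower())


def filter_records(records, dcmi_type=None, language=None, country=None):
    """Keep records matching every filter that was actually supplied."""
    kept = []
    for r in records:
        if dcmi_type is not None and normalize_type(r["type"]) != dcmi_type:
            continue
        if language is not None and r["language"] != language:
            continue
        if country is not None and r["country"] != country:
            continue
        kept.append(r)
    return kept


# Invented sample records.
RECORDS = [
    {"title": "Number Theory Preprints", "type": "Preprint", "language": "en", "country": "SE"},
    {"title": "Bodenkarte", "type": "map", "language": "de", "country": "DE"},
    {"title": "Crop Statistics", "type": "Database", "language": "en", "country": "DE"},
]
texts = filter_records(RECORDS, dcmi_type="Text")
german_country = filter_records(RECORDS, country="DE")
```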
Sorting
  Title (alphabetic sorting)
    in future: Type, Language, Country? (central architecture)
  Subject: Ren-DDC Classification
    mapping relation (fully equivalent, narrower equivalent, broader equivalent, major overlap, minor overlap)
  in discussion: Subject - Keywords: sorting after subject indexing group: controlled vocabulary versus free keywords, but problematic!
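The two sorting options named above, alphabetic by title and by mapping relation, could be sketched in Python as follows. Treating the order in which the slide lists the five relations as a ranking from closest to loosest match is an assumption of this sketch, as are the record layout and the sample hits.

```python
# Rank follows the order in which the relations are listed on the slide (assumed ranking).
RELATION_RANK = {
    name: rank
    for rank, name in enumerate([
        "fully equivalent",
        "narrower equivalent",
        "broader equivalent",
        "major overlap",
        "minor overlap",
    ])
}


def sort_by_title(hits):
    """Alphabetic sorting, ignoring case."""
    return sorted(hits, key=lambda h: h["title"].casefold())


def sort_by_relation(hits):
    """Closest mapping relation first; ties are broken alphabetically by title."""
    return sorted(hits, key=lambda h: (RELATION_RANK[h["relation"]], h["title"].casefold()))


# Invented sample hits.
HITS = [
    {"title": "zoology links", "relation": "fully equivalent"},
    {"title": "Animal Atlas", "relation": "minor overlap"},
    {"title": "Biology Hub", "relation": "fully equivalent"},
]
by_title = [h["title"] for h in sort_by_title(HITS)]
by_relation = [h["title"] for h in sort_by_relation(HITS)]
```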