Linking and Exploring Authority Files
TEL-ME-MOR/M-CAST Seminar, Prague, November 23rd 2006
Hans-Jörg Lieder, Berlin State Library
Subject access seminar, Prague 23.11.2006
Overview
• Authority Control / Context Information
• Limitations
• LEAF
• Objectives
• Innovations
• LEAF System Architecture
• LEAF and other Systems
Authority Control
• Libraries
– Separate data records (authority records) describing persons, corporate bodies, subjects etc., maintained independently from bibliographic records (as an independent database, e.g. PND, or as tables/groups of tables within a library system)
– Authority records can be linked to a variety of resources
– Purpose of a person name authority record: disambiguation of a name, i.e. establishing a 1:1 relationship between a name and a person
Context Information
• Archives
– Traditionally, biographical information is part of an archival record
– Purpose of biographical context information: providing context for the specific archival record
Present Limitations
• Access to authority data is limited to some institutions only
• Access to authority data depends on the cataloguing rules and data formats employed
• Cross-domain sharing of authority data (e.g. between libraries, archives, museums and other ‘memory institutions’) does not exist
• Public users do not have access to authority data
Examples
Minimal level library authority record:
– Smith, John, 1563-1616 (LOC)
Biographical context information in archives comes in all shapes and sizes
Richer library authority record:
LEAF
• European project within the 5th Framework of the European Commission, Programme “Information Society Technology”
• March 2001 – February 2004
• 15 partners in 10 countries
LEAF Objectives 1
LEAF (Linking and Exploring Authority Files) developed a prototype system through which (internationally) distributed person name authority records are gathered, automatically linked in meaningful ways, made available to a variety of operations and opened up for multiple analysis.
LEAF Objectives 2
The following steps are included:
• new or updated local name authority records are fetched/harvested by or uploaded to the LEAF system on a regular basis
• all records in the LEAF system are converted into one common exchange format (EAC) and inserted into a central database
• records describing the same person are automatically linked
• all records in the LEAF database are available for search and retrieval
LEAF Objectives 3
• retrieved search results are stored in a Central Name Authority File
• registered users can annotate records
• external systems can query the LEAF service
• LEAF can query external systems
• external resources can link to LEAF records
• results retrieved in LEAF can be used as search arguments in other applications
LEAF Innovations
• Common exchange format for libraries and archives
• Linking process
• Usage impact
• Addition of annotations
• Integration into a distributed search service
Exchange format
EAC, Encoded Archival Context
• XML DTD, parallel to EAD (Encoded Archival Description)
• Describes circumstances of creation and use of records, including the identification of persons, corporate bodies and families, their roles and relationships
• Compatible with the MARC family, MAB and the archival standard ISAAR(CPF) (International Standard Archival Authority Record for Corporate Bodies, Persons, and Families)
EAC structure
EAC example (partial)
<identity>
  <pershead rule="aacr2" authorized="UCM" languagecode="spa" scriptcode="latn" ea="100">
    <part type="surname">Pi Sunyer</part>
    <part type="forename">Charles</part>
  </pershead>
  <pershead rule="aacr2" languagecode="spa" scriptcode="latn" ea="400">
    <part type="surname">Pi i Sunyer</part>
    <part type="forename">Charles</part>
  </pershead>
  <pershead rule="aacr2" languagecode="spa" scriptcode="latn" ea="400">
    <part type="surname">Pi Sunyer</part>
    <part type="forename">Carlos</part>
  </pershead>
  <nameadds>
    <date scope="active" calendar="gregorian" normal="1888/1971" form="rangeclosed" ea="100$d">1888-1971</date>
  </nameadds>
</identity>
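As a rough illustration of how such a fragment could be read by software, the sketch below parses a shortened copy of the identity element with Python's standard xml.etree.ElementTree module. It is not LEAF code. The function name name_forms is hypothetical, and treating ea="100" as the authorised heading and any other value as a variant form is an assumption based on the MARC field numbers.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the identity fragment from the slide (two headings only).
EAC_IDENTITY = """
<identity>
  <pershead rule="aacr2" authorized="UCM" languagecode="spa" scriptcode="latn" ea="100">
    <part type="surname">Pi Sunyer</part>
    <part type="forename">Charles</part>
  </pershead>
  <pershead rule="aacr2" languagecode="spa" scriptcode="latn" ea="400">
    <part type="surname">Pi Sunyer</part>
    <part type="forename">Carlos</part>
  </pershead>
  <nameadds>
    <date scope="active" calendar="gregorian" normal="1888/1971"
          form="rangeclosed" ea="100$d">1888-1971</date>
  </nameadds>
</identity>
"""


def name_forms(identity_xml):
    """Collect preferred and variant name forms plus the normalised dates."""
    root = ET.fromstring(identity_xml.strip())
    forms = {"preferred": [], "variants": []}
    for head in root.findall("pershead"):
        parts = {p.get("type"): p.text for p in head.findall("part")}
        name = f"{parts.get('surname', '')}, {parts.get('forename', '')}"
        # Assumption: ea="100" marks the authorised heading, anything else a variant.
        key = "preferred" if head.get("ea") == "100" else "variants"
        forms[key].append(name)
    date = root.find("nameadds/date")
    forms["dates"] = date.get("normal") if date is not None else None
    return forms


if __name__ == "__main__":
    print(name_forms(EAC_IDENTITY))
```

The normal attribute of the date element is used instead of the display text because it already carries the dates in a machine-comparable form.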
Linking process
• Local records are harvested via FTP, OAI or Z39.50 and converted into EAC to form LEAF Authority Records (LARs)
• When LARs are found to describe the same person according to linking rules (based on names, dates, IDs), they are merged into a Shared LEAF Authority Record (SLAR)
• Records and links are regularly updated
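The actual LEAF linking rules are not given here, so the following is only a minimal sketch of the idea. It assumes two LARs describe the same person when their normalised dates are identical and they share at least one name form. The class LAR, the functions same_person and link, the provider names and the second Smith record are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LAR:
    """A LEAF Authority Record, reduced to the fields used for linking."""
    source: str        # hypothetical data provider name
    names: frozenset   # normalised name forms (preferred and variant)
    dates: str         # normalised life dates, e.g. "1888/1971"


def same_person(a, b):
    # Assumed rule: identical dates and at least one name form in common.
    return a.dates == b.dates and bool(a.names & b.names)


def link(lars):
    """Group LARs into lists; each list stands for one shared record (SLAR)."""
    groups = []
    for lar in lars:
        hits = [g for g in groups if any(same_person(lar, m) for m in g)]
        rest = [g for g in groups if all(g is not h for h in hits)]
        merged = [m for g in hits for m in g] + [lar]
        groups = rest + [merged]
    return groups


if __name__ == "__main__":
    records = [
        LAR("library-A", frozenset({"pi sunyer, charles", "pi sunyer, carlos"}), "1888/1971"),
        LAR("archive-B", frozenset({"pi i sunyer, charles", "pi sunyer, carlos"}), "1888/1971"),
        LAR("library-C", frozenset({"smith, john"}), "1563/1616"),
        LAR("library-D", frozenset({"smith, john"}), "1580/1631"),  # invented dates
    ]
    for group in link(records):
        print([r.source for r in group])
```

In this sketch the two Pi Sunyer records end up in one group, while the two Smith records stay separate because their dates differ. A real rule set would also compare IDs, as the slide mentions.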
Usage impact
• When a user retrieves a LEAF record, its status is changed to Central Name Authority Record (CNAR)
• Data providers can see which of their records have been used and target the improvements they make to their data
• Records never used may be removed from the central system after an expiry date
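A small sketch of how this usage tracking might be modelled follows. It is an assumption-laden illustration, not the LEAF implementation: the class LeafRecord, the functions retrieve, used_by_provider and expired, the initial status value "LAR" and the one-year expiry period are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class LeafRecord:
    record_id: str
    provider: str
    loaded_on: date
    status: str = "LAR"   # assumed initial status before any retrieval
    retrievals: int = 0


def retrieve(record):
    """A user retrieval promotes the record to a Central Name Authority Record."""
    record.status = "CNAR"
    record.retrievals += 1
    return record


def used_by_provider(records, provider):
    """IDs of a provider's records that have been retrieved at least once."""
    return [r.record_id for r in records
            if r.provider == provider and r.status == "CNAR"]


def expired(records, today, max_age):
    """IDs of never-used records older than max_age (removal candidates)."""
    return [r.record_id for r in records
            if r.status != "CNAR" and today - r.loaded_on > max_age]


if __name__ == "__main__":
    records = [
        LeafRecord("r1", "library-A", date(2003, 1, 1)),
        LeafRecord("r2", "library-A", date(2003, 1, 1)),
        LeafRecord("r3", "archive-B", date(2004, 1, 1)),
    ]
    retrieve(records[0])
    print(used_by_provider(records, "library-A"))
    print(expired(records, date(2004, 2, 1), timedelta(days=365)))
```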
Public annotations
• Registered users can make temporary annotations to (shared) LEAF records, e.g. to suggest a correction
• Data providers are alerted when an annotation is made to one of their records
• Registered users can make persistent annotations to central LEAF records, e.g. to provide a piece of information of general interest
Private annotations
• A private workspace is available to registered users
• They can save central LEAF records into it
• They can make private annotations to the saved records
• They receive a warning if a central LEAF record they saved is modified
LEAF System Architecture
[Diagram. Component labels: Offline components, Online components, Repository, Database access unit, LEAF database, Acquisition, Harvesting unit, Local databases (three), Data import unit, Conversion unit, Linking unit, Presentation unit, User Workspace unit, Export unit, Maintenance Suite, External Z39.50 system, External services, User interface, Logging unit, Admin. unit, Registration unit, Annotation unit, Search unit, MALVINE user interface, MALVINE search engine]
Possible problem with a distributed search
Search argument: Ludwig
• System whose record reads 100 Louis $b XIV $c Roi de France (no variant form): returns no records
• System whose record reads 100 Ludwig $b 14 $c franz. König with 400 Louis $b XIV $c Roi de France: returns records
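The effect can be reproduced with a toy model of the two systems, together with one possible remedy: expanding the search argument with all name forms that a linked authority record brings together. This is only a sketch. The system and record identifiers, the functions search and expanded_search, and the set LINKED_FORMS are hypothetical, and MARC subfield codes are dropped for brevity.

```python
# Two hypothetical systems, each holding one record with a preferred form
# (field 100) and optional variant forms (field 400).
SYSTEM_FR = {"fr-1": {"100": "Louis XIV Roi de France", "400": []}}
SYSTEM_DE = {"de-1": {"100": "Ludwig 14 franz. König",
                      "400": ["Louis XIV Roi de France"]}}

# Name forms that a linked authority record would group together (assumed).
LINKED_FORMS = {"louis", "ludwig"}


def search(system, term):
    """Return IDs of records in which term occurs as a word in any name form."""
    term = term.lower()
    hits = []
    for rec_id, rec in system.items():
        forms = [rec["100"], *rec["400"]]
        if any(term in form.lower().split() for form in forms):
            hits.append(rec_id)
    return hits


def expanded_search(systems, term, linked_forms):
    """Search every system with the term plus all forms linked to it."""
    terms = {term.lower()}
    if term.lower() in linked_forms:
        terms |= linked_forms
    hits = set()
    for system in systems:
        for t in terms:
            hits.update(search(system, t))
    return sorted(hits)


if __name__ == "__main__":
    print(search(SYSTEM_FR, "Ludwig"))   # the system without the variant form
    print(search(SYSTEM_DE, "Ludwig"))
    print(expanded_search([SYSTEM_FR, SYSTEM_DE], "Ludwig", LINKED_FORMS))
```

With the plain search, only the system holding the form "Ludwig" answers. With the expanded search, both systems return their record.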
The Issue of Names
Turgenev, Ivan S.
Turgenjew, Iwan S.
Turgenev, Ivan Sergeevich
Turgenev, Ivan Sergeevic
Turgenev, Ivan
Turgenjew, Iwan
Turgenjew, Iwan Sergejewitsch
Turgenjew, I. S.
Turgenjew, I.
Turgenew, Iwan
Turgenew, Iwan S.
Turgenew, Iwan Sergejewitsch
Turgenjew, Iwan Sergejewic
Turjenjew, Iwan S.
Turgenjeff, Iwan
Turgenjeff, Iwan S.
Turgenjeff, I. S.
Turgeniew, Iwan S.
Turgeniew, Iwan
Turgenjev, Iwan
Turghenew, Iwan
Turgenew, Johann von
Turgenew, I. S.
Turgeneff, Johann von
Tourguénev, Ivan
Turgeneff, Sergei
Turgénjew, Iwan Sergejewitsch
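To show why string rules alone do not settle the matter, here is a crude normalisation sketch applied to the surname part of such headings. The function names and the transliteration rules are hypothetical and were chosen only to fit this list. They are not a general romanisation scheme and not part of LEAF.

```python
import re
import unicodedata


def surname_key(heading):
    """Crude, illustrative normalisation of the surname part of a heading."""
    surname = heading.split(",", 1)[0].strip().lower()
    # Strip diacritics, e.g. "tourguénev" -> "tourguenev".
    decomposed = unicodedata.normalize("NFKD", surname)
    surname = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Hypothetical transliteration rules, tuned to this example only.
    surname = surname.replace("ou", "u")                     # French "ou"
    surname = re.sub(r"g[hu](?=e)", "g", surname)            # "ghe", "gue" -> "ge"
    surname = re.sub(r"[ji]?e(?:v|w|ff)$", "ev", surname)    # -jew, -iew, -eff ...
    return surname


def group_by_key(headings):
    """Group headings that share the same normalised surname."""
    groups = {}
    for heading in headings:
        groups.setdefault(surname_key(heading), []).append(heading)
    return groups


if __name__ == "__main__":
    sample = [
        "Turgenev, Ivan S.",
        "Turgenjew, Iwan Sergejewitsch",
        "Turghenew, Iwan",
        "Tourguénev, Ivan",
        "Turgenjeff, I. S.",
        "Turjenjew, Iwan S.",
        "Turgenew, Johann von",
    ]
    for key, members in group_by_key(sample).items():
        print(key, members)
```

Even with rules fitted to the data, "Turjenjew" falls into a group of its own, and forename variants such as "Johann von" cannot be matched to "Ivan" by string rules at all. Explicitly linked authority records cover such cases.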
Integration
LEAF functionalities were tested in combination with the MALVINE service (i.e. a distributed search service for manuscript descriptions; see: www.malvine.org).
How does it work?
Next steps / Failed Plans
• After the end of the project (February 2004), the consortium planned to scale up to a full LEAF service.
• But so far there has been no follow-up, and the project website has disappeared.
• However, the prototype remains: http://iiss033.joanneum.at:8100/Leaf2/index
• No integration yet with MALVINE.
• Current deliberations: Kalliope, MALVINE.
Follow-up?
• Reactivate?
• Study add-ons to VIAF? E.g. annotations
• Any experiences with LEAF to share?
Depends on the future of TEL!
Contacts
• Technical questions:[email protected]
• All other questions:[email protected]