metadata mapping & crosswalks
Metadata Mapping &
Metadata Crosswalks
Nikos Palavitsinis, PhD
Alternative Title: “the story of combining
Ariadne’s thread with the Gordian Knot”
What are crosswalks?
• Crosswalks show people where to put the data from one scheme into a different scheme. They are often used by libraries, archives, museums, and other cultural institutions to translate data to or from MARC, Dublin Core, TEI, and other metadata schemes.
source
One-way only
• The process of translating from one schema to another is called metadata mapping or field mapping [source]
• A crosswalk works in one direction: the crosswalk from MARC to DC is separate from the crosswalk from DC to MARC
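A minimal sketch of the one-way idea, assuming a crosswalk can be held as a simple lookup table. The field pairs below are simplified illustrations, not an authoritative MARC-to-DC crosswalk, and the helper names `crosswalk` and `invert` are hypothetical.

```python
# Illustrative, simplified field pairs (assumption, not an official crosswalk).
MARC_TO_DC = {
    "245": "title",
    "100": "creator",
    "700": "contributor",
    "260": "publisher",
    "650": "subject",
    "653": "subject",
}


def crosswalk(record, mapping):
    """Copy each source value into the target element named by the mapping."""
    target = {}
    for field, values in record.items():
        element = mapping.get(field)
        if element is None:
            continue  # no equivalent element in the target scheme
        target.setdefault(element, []).extend(values)
    return target


def invert(mapping):
    """List, for every target element, all source fields that feed it."""
    reverse = {}
    for source, target in mapping.items():
        reverse.setdefault(target, []).append(source)
    return reverse


marc_record = {
    "245": ["Metadata in Practice"],
    "650": ["Metadata"],
    "653": ["Crosswalks"],
}
dc_record = crosswalk(marc_record, MARC_TO_DC)
print(dc_record)

# "subject" has two candidate MARC fields, so the table cannot simply be
# flipped to go from DC back to MARC; the reverse needs its own crosswalk.
print(invert(MARC_TO_DC)["subject"])
```

Because two MARC fields land in the same DC element here, the reverse direction has to make a choice that the forward table does not record.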
Mapping Problems
• Element A in Scheme A contains X values that need to be split up into Element 1 and Element 2 of Scheme B
• Element A in Scheme A can take more than one value (multiplicity of n), whereas the equivalent Element 2 in Scheme B takes all these values in a single field
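The two problems above can be sketched as follows. This is only an illustration: the element names `family_name` and `given_name`, the "Family, Given" input convention, and the "; " separator are assumptions, not part of any particular scheme.

```python
def split_name(full_name):
    """Split one 'Family, Given' value into two target elements."""
    family, _, given = full_name.partition(",")
    return {"family_name": family.strip(), "given_name": given.strip()}


def merge_values(values, separator="; "):
    """Collapse a repeatable element into a single-field value."""
    return separator.join(values)


print(split_name("Hillmann, Diane I."))
print(merge_values(["metadata", "crosswalks"]))
```

Merging is lossy in practice: if a value already contains the separator, the single field can no longer be split back into the original values.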
Mapping Problems
• Different data formats across schemas (use of names, other conventions, etc.)
• Element A in Scheme A is indexed but the equivalent element in the other scheme is not
• Scheme A uses a different controlled vocabulary for the same Element than Scheme B
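A small sketch of two of these problems, name conventions and controlled vocabularies. The Scheme A terms are hypothetical, the Scheme B terms are loosely modeled on DCMI Type values, and both function names are assumptions made for illustration.

```python
# Hypothetical Scheme A terms mapped to Scheme B terms (assumption).
TYPE_VOCAB_A_TO_B = {
    "photograph": "StillImage",
    "map": "StillImage",
    "recording": "Sound",
}


def translate_term(term):
    """Return the Scheme B term, or None when no equivalent is known."""
    return TYPE_VOCAB_A_TO_B.get(term.lower())


def normalize_name(name):
    """Turn 'Given Family' into 'Family, Given'; leave other forms untouched."""
    if "," in name or " " not in name:
        return name
    given, family = name.rsplit(" ", 1)
    return f"{family}, {given}"


print(translate_term("Photograph"))
print(translate_term("sculpture"))
print(normalize_name("Diane Hillmann"))
```

Terms that return `None` are the ones a person still has to review; the table only covers the cases someone has already decided.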
“The more metadata experience we have, the more it becomes clear that metadata perfection is not attainable, and anyone who attempts it will be sorely disappointed. When metadata is crosswalked between two or more unrelated sources, there will be data elements that cannot be reconciled in an ideal manner. The key to a successful metadata crosswalk is intelligent flexibility. It is essential to focus on the important goals and be willing to compromise in order to reach a practical conclusion…”
“Metadata in Practice”, Diane I. Hillmann and Elaine L. Westbrooks, eds., American Library Association, Chicago, 2004, p. 91.
Automated?
• Metadata crosswalks can be automated, but because metadata standards are complex and heavily customized, only a few general-purpose automated processes exist for crosswalks
Mapping between formats
• Excellent resource by Michael Day of UKOLN
– http://www.ukoln.ac.uk/metadata/interoperability/
Source
Metadata Element Set
• Two key components
– Semantics: Definitions of the meanings of the elements
– Content: Declarations or instructions (or rules) of what and how values should be assigned to elements
Why map metadata?
• “Interoperability is the ability of multiple systems with different hardware and software platforms, data structures, and interfaces to exchange data with minimal loss of content and functionality”
NISO (National Information Standards Organization). (2004). Understanding metadata. Bethesda, MD: NISO Press. Available: <http://www.niso.org/standards/resources/UnderstandingMetadata.pdf>.
Interoperability
…on a schema level: focusing on the elements of the schemas, independent of any applications. The results are derived element sets, encoded schemas, crosswalks, application profiles, and element registries
…on a record level: focusing on integrating metadata records by mapping the elements according to their semantic meanings. The results are converted records and new records created by combining values of existing records
Interoperability
…on a repository level: focusing on mapping value strings associated with particular elements (terms associated with subject or format elements). The results enable cross-collection searching
Source: http://www.dlib.org/dlib/june06/chan/06chan.html
Interoperability on the schema level
• This is achieved through:
– Derivation
• Using elements from existing schemas or standards, as they are
– Application Profiling
• Localizing and optimizing schemata for specific contexts
– Metadata Crosswalks
• Mapping elements, semantics, and syntax from one metadata scheme to those of another
Interoperability on the schema level
• This is achieved through:
– Switching Across
• When crosswalking among several schemas, it is easier to use a central one as a switch and crosswalk all the others to it
– Metadata Framework
• Either developing it based on existing schemas, or establishing it before the development of schemas and application profiles
– Metadata Registry
• Offering a centralized access point to existing schemas, to facilitate the development of new ones and “foster” interoperability
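The "switching across" idea can be sketched as composing two crosswalks through the central schema. The sketch assumes that the switch is one of the n schemas, that each direction needs its own crosswalk, and that all element names are hypothetical.

```python
def pairwise_crosswalks(n):
    """Directed crosswalks needed when every schema maps to every other."""
    return n * (n - 1)


def hub_crosswalks(n):
    """Directed crosswalks needed when one of the n schemas acts as a switch."""
    return 2 * (n - 1)


def compose(source_to_hub, hub_to_target):
    """Chain two crosswalks; elements with no path through the hub drop out."""
    return {
        source: hub_to_target[hub]
        for source, hub in source_to_hub.items()
        if hub in hub_to_target
    }


# Hypothetical element names for two schemas and a central switch schema.
A_TO_HUB = {"heading": "title", "maker": "creator", "shelf_mark": "identifier"}
HUB_TO_B = {"title": "name", "creator": "author"}

print(pairwise_crosswalks(5), hub_crosswalks(5))
print(compose(A_TO_HUB, HUB_TO_B))
```

The saving in the number of crosswalks comes at a price: anything the switch schema cannot express, or cannot pass on, is lost on the way from A to B.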
Crosswalking Approaches
• Absolute crosswalking
– You only match the elements that are 100% equivalent and you ignore the rest
• Useful when mapping from a simpler to a more complex schema
• Relative crosswalking
– You map all elements in a source schema to at least one element of a target schema
• Useful when mapping from a complex to a simpler schema
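The difference between the two approaches can be sketched with a table of candidate matches, each flagged as exact or not. The element names and the `exact` flags are hypothetical and would come from a human comparison of the two schemas.

```python
# source element -> (closest target element, is the match 100% equivalent?)
CANDIDATES = {
    "title": ("title", True),
    "author": ("creator", True),
    "illustrator": ("contributor", False),
    "shelf_mark": ("identifier", False),
}


def absolute_crosswalk(candidates):
    """Keep only the exact equivalents and ignore the rest."""
    return {src: tgt for src, (tgt, exact) in candidates.items() if exact}


def relative_crosswalk(candidates):
    """Map every source element to its closest target element."""
    return {src: tgt for src, (tgt, _exact) in candidates.items()}


print(absolute_crosswalk(CANDIDATES))
print(relative_crosswalk(CANDIDATES))
```

The absolute version leaves `illustrator` and `shelf_mark` unmapped, while the relative version keeps their values at the cost of a looser meaning in the target.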
Three Meanings of Interoperability
• Semantic
– Semantic mapping is the process of analyzing the definitions of the elements or fields to determine whether they have the same or similar meanings
• Cultural
– presence of data models or wrappers that specify the semantic schema being used
• Syntactic (technical)
– the ability to communicate, transport, store, and represent metadata and other types of information between and among different systems and schemas
Source
Examples of Metadata Ingestion
Bitter Harvest: Problems & Suggested Solutions for OAI-PMH Data & Service Providers
[Flowchart, Source: Europeana_Sounds. Steps in the order extracted:]
• Fill Partner Request Form
• Process Partner Request Form and decide on viable aggregation route
• Send Data Exchange Agreement (DEA)
• Inform aggregator and liaise with potential data provider
• Sign DEA and send to Europeana (data providers or aggregators have to sign with aggregator)
• Send Data Contribution Form
• Fill Data Contribution Form and send to Europeana
• Process Data Contribution Form to enable first delivery of data
• Delivery of data via OAI-PMH or FTP: sample or full datasets (new data providers)
• Feedback on metadata structure, mandatory elements, rights statements
• Delivery of ingest-ready data: full datasets (all data providers)
• Feedback taken into account
• Check data
• Feedback on metadata structure, mandatory elements, rights statements
• Ingestion of datasets fully compliant to publication policy
• Publication of the submitted datasets in Europeana
Legend: Action for data provider or aggregator; Action for Europeana
Timeline labels: Before 5th of a month; Before 15th of a month; Before 21st of a month; Between 21st and 30th of a month; Between 10th and 20th of following month
Metadata Operations
• Metadata Harvesting
– The process of collecting metadata descriptions of records in an archive so that services can be built using metadata from many archives [source]
• Metadata Validation
– The process of checking the structure of a metadata record to determine whether or not the record complies with a predefined set of criteria
• Metadata Ingestion
– The process of bringing metadata records (and/or content) into your system [source]
– e.g. you ingest metadata through harvesting [source]
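Validation against a predefined set of criteria can be sketched as a check for mandatory elements. The list of required elements below is an assumption chosen for illustration; a real aggregator publishes its own criteria.

```python
# Hypothetical mandatory elements (assumption, not a published profile).
REQUIRED_ELEMENTS = ("title", "identifier", "rights")


def validate(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for element in REQUIRED_ELEMENTS:
        value = record.get(element)
        if value is None or not str(value).strip():
            problems.append(f"missing mandatory element: {element}")
    return problems


record = {"title": "Sample record", "identifier": "oai:example.org:1", "rights": ""}
print(validate(record))
```

Returning the list of problems, rather than a plain yes or no, gives the data provider feedback it can act on.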
Metadata Operations
• Metadata Transformation
– Converting a set of metadata values from the format of a source system into the format of a destination system [source]
• Metadata Enrichment
– The process of adding metadata to an existing metadata record, thus creating a new record, with added-value operations
• Metadata Publishing
– The process of making metadata data elements available to external users, both people and machines, using a formal review process and a commitment to change control processes [source]
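Transformation and enrichment can be sketched on a single record. The source date format, the language lookup table, and the `language_label` element are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical lookup used to enrich records (assumption).
LANGUAGE_LABELS = {"el": "Greek", "en": "English"}


def transform_date(value, source_format="%d/%m/%Y"):
    """Convert a date from the source system's format to ISO 8601."""
    return datetime.strptime(value, source_format).date().isoformat()


def enrich(record):
    """Return a new record with a language label added when one is known."""
    enriched = dict(record)  # the original record is left unchanged
    label = LANGUAGE_LABELS.get(record.get("language"))
    if label:
        enriched["language_label"] = label
    return enriched


original = {"title": "Sample record", "language": "el"}
print(transform_date("21/06/2006"))
print(enrich(original))
```

Transformation changes how an existing value is written, while enrichment produces a new record that carries more than the original did.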
Step 1
Harvesting
You harvest the metadata through OAI-PMH in an “intermediate” system
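A minimal sketch of this harvesting step, assuming a hypothetical endpoint at `https://example.org/oai`. It only builds the `ListRecords` request and parses a hand-written sample response; it makes no network call and does not handle resumption tokens or error responses.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text):
    """Extract the identifier and the Dublin Core titles of each record."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", namespaces=NS)
        titles = [t.text for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "title": titles})
    return records


# Hand-written sample response (assumption), trimmed to the parts used here.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

url = list_records_url("https://example.org/oai")
harvested = parse_records(SAMPLE_RESPONSE)
print(url)
print(harvested)
```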
Step 2
Harvesting
Ingestion
The metadata are ingested into the target repository or any other intermediate system
Step 3
Harvesting
Ingestion
Mapping
Metadata elements are mapped to the metadata schema of the receiving repository
Step 4
Harvesting
Ingestion
Mapping
Validation
You pass the metadata through a mechanism that checks their integrity in reference to a predefined standard/schema
Step 5
Harvesting
Ingestion
Mapping
Validation
Transformation
Metadata are subjected to the necessary transformations
identified by the validation step
Step 6
Harvesting
Ingestion
Mapping
Validation
Transformation
Enrichment
If necessary, metadata may be enriched further, adding value or changing them altogether
Transformation & Enrichment
Step 7 … … … Step 1.223.124
Harvesting
Ingestion
Mapping
Validation
Transformation
Enrichment
Publishing
Metadata are published on the target repository and are also offered through an OAI-PMH target
And round it goes!
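The seven steps can be sketched as a chain of small functions, each taking the output of the previous one. Everything here is a stand-in: `harvest` returns hard-coded sample records instead of calling an OAI-PMH endpoint, and the element names, validation rules, and provider name are assumptions.

```python
def harvest(_):
    """Stand-in for an OAI-PMH harvest: returns hard-coded sample records."""
    return [
        {"ttl": "  Sample record ", "id": "oai:example.org:1"},
        {"ttl": "", "id": "oai:example.org:2"},
    ]


def ingest(records):
    """Copy the harvested records into the (in-memory) target system."""
    return [dict(r) for r in records]


def map_elements(records):
    """Rename source elements to the receiving repository's element names."""
    mapping = {"ttl": "title", "id": "identifier"}
    return [{mapping[k]: v for k, v in r.items() if k in mapping} for r in records]


def validate(records):
    """Attach a list of issues to every record."""
    for r in records:
        title = r.get("title", "")
        issues = []
        if title != title.strip():
            issues.append("untrimmed title")
        if not title.strip():
            issues.append("empty title")
        r["issues"] = issues
    return records


def transform(records):
    """Fix the issues that validation found and that can be fixed automatically."""
    for r in records:
        if "untrimmed title" in r["issues"]:
            r["title"] = r["title"].strip()
            r["issues"].remove("untrimmed title")
    return records


def enrich(records):
    """Add a provider name to every record."""
    return [{**r, "provider": "Example Archive"} for r in records]


def publish(records):
    """Release only the records that have no open issues."""
    return [
        {k: v for k, v in r.items() if k != "issues"}
        for r in records
        if not r["issues"]
    ]


STEPS = [harvest, ingest, map_elements, validate, transform, enrich, publish]


def run_pipeline(steps):
    records = None
    for step in steps:
        records = step(records)
    return records


published = run_pipeline(STEPS)
print(published)
```

The record with the empty title never reaches publication, which is where the loop starts again: the provider fixes the source data and the next harvest picks it up.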
Reading Material
Other Sources/Projects/Initiatives:
• http://www.slideshare.net/RoldanBasilio/metadata-mapping-61747115
• http://pro.carare.eu/doku.php?id=support:metadata-mapping
• http://old.carare.eu/eng/Support/About-metadata-mapping
• https://en.wikipedia.org/wiki/Data_mapping
• http://www.oclc.org/research/themes/data-science/schematrans.html
• https://indico.cern.ch/event/103325/contributions/1300399/attachments/11668/17064/OAI7_UNSW.pdf
• http://www.slideshare.net/locloud/the-mint-mapping-tool-and-the-more-aggregator
• http://www.slideshare.net/Europeana_Sounds/aggregation-workflow