metadata ingestion plan presentation

europeanasounds.eu

Metadata Ingestion Training, 23-24 October 2014, NTUA, Athens

Metadata Ingestion Plan
Targets
Reporting progress
Andra Patterson, Metadata Manager, Europeana Sounds


Metadata Ingestion Plan

Takes into account:

• 4 main stages of aggregation

• Needs of data providers for scheduling

• Info from Rights and metadata ingestion survey

• Info from emails, phone calls, etc.

• Targets from DoW

Flexible - may need to take into account:

• Changing needs of data providers during project

• Needs of Europeana Ingestion Team

Aggregation – 4 main stages

Content selection

Metadata preparation

Metadata ingestion

Metadata curation

Aggregation – Stage 1

Content selection

Select the objects for which you will provide metadata to Europeana Sounds

• According to selection guidelines in D1.1 Content Selection Policy

• According to figures in Table 0, DoW (part B, p.22-27)

Establish the correct rights statements for the objects

• Use Europeana Available Rights Statements

Aggregation – Stage 2

Metadata preparation

Prepare your metadata and export in .xml or .csv

• Check that mandatory elements are included or can be added

• Check that source metadata is well-formed

• Ensure that digital objects are accessible via links in metadata

• Ensure that objects that can be made available for re-use fit the criteria in the Europeana Content Re-use Framework

– File quality; Rights
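The first two checks above (mandatory elements present, source metadata well-formed) can be automated before upload. The following is a minimal sketch only: the `REQUIRED` list and the function name are hypothetical placeholders, and the real mandatory set is whatever the EDM profile specifies.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of required element names, for illustration only.
# The real mandatory set is defined by the EDM profile, not by this sketch.
REQUIRED = ["title", "rights"]


def check_record(xml_text, required=REQUIRED):
    """Return a list of problems found in one exported XML record."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # The source metadata is not well-formed XML
        return [f"not well-formed: {err}"]
    # Compare local element names only, ignoring namespace prefixes
    present = {el.tag.rsplit("}", 1)[-1] for el in root.iter()}
    return [f"missing element: {name}" for name in required if name not in present]
```

An empty list means the record passed both checks; anything else can be reported back to whoever prepares the export.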

Aggregation – Stage 3

Metadata ingestion

Ingest your metadata records using the MINT tool

• MINT

– Web-based tool

– Developed by NTUA

– Used to map, ingest and deliver metadata to Europeana

• Map metadata to the schema defined in D1.4 EDM Profile for Sound

Aggregation – Stage 4

Metadata curation

Enrich your metadata records using MINT tool

• Normalise metadata

• Enrich metadata

• Add controlled vocabulary terms

Targets

Table 0 Underlying Content (Part B, p.22-27) = what we are contracted to achieve

Targets

Progress measured against Performance Monitoring Table (Part B, p.91)

“Available for re-use” Europeana definition:

PDM, CC0, CC-BY, CC-BY-SA
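To count the re-use subset consistently, the definition above can be turned into a simple test on a rights statement URI. This is a sketch that assumes the rights statements are expressed as standard Creative Commons URIs; the function name is hypothetical.

```python
import re

# Matches PDM, CC0, CC BY and CC BY-SA URIs of any version.
# Assumption: rights statements use the creativecommons.org URI form.
_REUSE = re.compile(
    r"^https?://creativecommons\.org/"
    r"(publicdomain/(mark|zero)|licenses/(by|by-sa))/\d"
)


def available_for_reuse(rights_uri):
    """True if the URI is one of PDM, CC0, CC BY or CC BY-SA."""
    return bool(_REUSE.match(rights_uri.strip()))
```

Licences with NC or ND clauses, and any non-Creative-Commons statement, fall outside the pattern and are therefore not counted.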

Targets

Targets for each “metadata set”

Set 1: October 2014-January 2015 (Milestone 5)

Set 2: February 2015-January 2016 (no formal Milestone)

Set 3: February 2016-July 2016 (Milestone 6)

Milestones say: “Content and metadata ready for ingestion”

Targets

Chart showing required (minimum) metadata ingestion progress (vertical axis 0 to 800,000; series: Audio, Audio-related, Re-use subset)

Reporting progress – what to count

• DoW requires us to count digital objects

– Digital objects must be counted the same way as in the DoW

• Audio objects

• Audio-related objects

• Objects “Freely available for re-use”

– These are a subset of the total, not additional items

• Also count metadata records

– Useful to compare what you have prepared for publication with what is actually published on Europeana
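The counting rules above can be sketched as a small tally. The record layout used here (keys `kind`, `objects`, `reusable`) is a hypothetical illustration, not a format defined by the project. Note that one record may stand for several digital objects, and that the re-use figure is a subset of the object total rather than an addition to it.

```python
from collections import Counter


def tally(records):
    """Count records and digital objects.

    records: iterable of dicts with hypothetical keys
      'kind'     -> 'audio' or 'audio-related'
      'objects'  -> number of digital objects the record represents
      'reusable' -> True if the objects are freely available for re-use
    """
    totals = Counter()
    for rec in records:
        totals["records"] += 1
        totals[rec["kind"]] += rec["objects"]
        totals["objects"] += rec["objects"]
        if rec["reusable"]:
            # A subset of the total, so it is not added to 'objects' again
            totals["reuse_subset"] += rec["objects"]
    return dict(totals)
```

For example, one reusable audio record with a single object plus one audio-related record covering 12 objects gives 2 records, 13 objects, and a re-use subset of 1.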

Counting BL digitised sound

Each line is a metadata record

One metadata record usually represents one digital object

No duplicates, please!

Keep track internally of what you have supplied to Europeana already for this project and for other Europeana projects – no duplicates!
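One way to keep track is to hold a list of identifiers already supplied and check each new batch against it. A minimal sketch, assuming every record has a stable identifier; the function and parameter names are hypothetical.

```python
def split_new_and_duplicates(candidate_ids, already_supplied):
    """Separate identifiers that are safe to supply from duplicates.

    already_supplied: identifiers sent to Europeana before, for this
    project or any other. Repeats inside the batch also count as duplicates.
    """
    seen = set(already_supplied)
    new, dupes = [], []
    for ident in candidate_ids:
        if ident in seen:
            dupes.append(ident)
        else:
            seen.add(ident)
            new.append(ident)
    return new, dupes
```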

Counting BL digitised printed scores

Each line is a metadata record

Number of digital objects counted for DoW Table 0

One metadata record often represents many digital objects

Reporting progress – how to record

• Record statistics in your Google or Excel spreadsheet

– See Europeana Sounds Manual for Data Providers section 3.3.3 for links to Google spreadsheets (will be active next week!)

• Update your spreadsheet by 3rd Friday of each month

• Targets

– are based on Table 0, Metadata Ingestion Survey, emails

– are distributed across the 3 metadata sets

– are the minimum required - feel free to do more!
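The monthly deadline (the 3rd Friday) can be worked out with the standard library, for example to set reminders. A small sketch; the function name is an illustration only.

```python
import calendar
from datetime import date


def third_friday(year, month):
    """Return the date of the third Friday of the given month."""
    # weekday(): Monday is 0, Friday is 4
    first_weekday = date(year, month, 1).weekday()
    # Days from the 1st of the month to the first Friday
    offset = (calendar.FRIDAY - first_weekday) % 7
    # The third Friday is two weeks after the first
    return date(year, month, 1 + offset + 14)
```

For November 2014 this gives 21 November.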

Sample Google spreadsheet showing targets for BL – edit the orange cells!

Thank you for listening!

Metadata Ingestion Training, 23-24 October 2014, NTUA, Athens

Metadata Quality
Meaningful metadata
Rights
Controlled vocabularies
Andra Patterson, Metadata Manager, Europeana Sounds

Metadata Quality

• The richer the metadata, the better for discovery by users

• Europeana Sounds provides an opportunity for us to enhance our metadata and check quality

• EDM mandatory elements ensure a minimum metadata standard

• Metadata Quality Task Force (end 2013-mid 2014)

– Quality of metadata varies between institutions

– Need meaningful information in fields

Metadata Quality – Main Issues

• To aid discovery, metadata needs to provide context to the CHO

– Include a meaningful title and/or description

• Metadata needs to be understandable to

– Humans (e.g. rich descriptions, rights information)

– Machines (e.g. UTF-8 encoding, xml:lang)

• Metadata needs to be standardised

– EDM-compliant

– Controlled vocabularies (edm:type, ebucore:hasGenre)
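The machine-readability points above (UTF-8 encoding and language tagging) lend themselves to an automated check. The sketch below is a simplification under stated assumptions: it takes well-formed XML as input, it only looks at elements that hold text and have no children, and it ignores the fact that xml:lang can be inherited from a parent element. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def machine_readability_issues(raw_bytes):
    """Return a list of encoding and language-tag issues for one record."""
    try:
        raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return ["not valid UTF-8"]
    issues = []
    root = ET.fromstring(raw_bytes)  # assumes well-formed XML
    for el in root.iter():
        is_leaf = len(el) == 0
        has_text = bool(el.text and el.text.strip())
        if is_leaf and has_text and XML_LANG not in el.attrib:
            issues.append("no xml:lang on " + el.tag.rsplit("}", 1)[-1])
    return issues
```

Not every text element needs a language tag (identifiers and dates, for instance), so the output is a list of candidates to review rather than a list of errors.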

Getting Rights Right!

• Establish the rights of your web resources

– May need to discuss with colleagues

– Use information & resources from WP3

• Important to use the most appropriate rights statement for your web resources

– Tells users what they can or can’t do with an object

– Web resources of Public Domain CHOs should be labelled as Public Domain – discuss any issues about this with Andra Patterson or Lisette Kalshoven

Rights – Public Domain Works

• Europeana Public Domain Charter

– “Digitisation of Public Domain content does not create new rights over it”

• Europeana Sounds Consortium Agreement

– “… where possible … content which is in the Public Domain … will be made available without any access restriction and will be labelled as being in the Public Domain …”

• Some data providers may encounter issues with this, e.g.

– Commercial re-use considered inappropriate

• Academic, artistic, private OK; some commercial re-use considered inappropriate; sponsorship funds provided according to this (ONB)

– Desire to refinance digitisation activities

• Government funding is basic – charging fees for high quality images contributes to refinancing digitisation (ONB)

• However, non-profit institutions run risk of losing non-profit status by earning too much from commercial users! (ONB)

– Legal

• Case law in UK is inconclusive so far (BL)

Rights – EDM

edm:ProvidedCHO dc:rights

– Name of rights holder of CHO, or more general rights information

edm:WebResource dc:rights

– Name of rights holder of a particular web resource, or more general rights information

edm:WebResource edm:rights (Strongly recommended)

– Formal rights statement for a particular web resource

– Overrides statement in ore:Aggregation edm:rights (see below)

– Choose from http://pro.europeana.eu/available-rights-statements

ore:Aggregation edm:rights (Mandatory)

– Formal rights statement for a particular web resource without edm:rights (see above)

– Formal rights statement for a group of web resources without their own edm:rights, when these are attached to one CHO

– Choose with care from http://pro.europeana.eu/available-rights-statements
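The override rule described above can be sketched as a small lookup: a web resource uses its own edm:rights when it has one, and otherwise falls back to the mandatory edm:rights on the ore:Aggregation. The function names and the dictionary layout are hypothetical; the sketch models only the precedence, not the EDM data structures themselves.

```python
def effective_rights(aggregation_rights, web_resource_rights=None):
    """Return the rights statement that applies to one web resource."""
    # A web resource's own edm:rights, if present, overrides the aggregation's
    return web_resource_rights or aggregation_rights


def rights_per_resource(aggregation_rights, web_resources):
    """web_resources: hypothetical mapping of resource URL -> edm:rights or None."""
    return {
        url: effective_rights(aggregation_rights, own)
        for url, own in web_resources.items()
    }
```

This is also why the aggregation-level statement needs choosing with care: it silently applies to every attached web resource that lacks its own statement.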

What is this?

Danish pastry

Wieneråtta

Wienerbrød

Kopenhagener Plunder

Dänischer Plunder

Danish

Controlled Vocabularies

• Enable users to search and navigate across different metadata sets

• Important in Europeana Portal, where different data providers use different vocabularies

• Bring together using linked data where possible

– LC Linked Data Service

– VIAF (Virtual International Authority File)

Controlled Vocabularies – Linked Data

VIAF Virtual International Authority File

Shared, Controlled Vocabularies

• EDM vocabularies

– edm:rights

• http://pro.europeana.eu/available-rights-statements

– edm:type

• TEXT, VIDEO, SOUND, IMAGE, 3D

• Europeana Sounds new vocabularies

– dcterms:medium

• Europeana Carrier Types Vocabulary

– ebucore:hasGenre

• Europeana Music Genre/Form Vocabulary

• Europeana Non-Music Genre/Form Vocabulary

Europeana Vocabularies – Carrier Types

Europeana Carrier Types Vocabulary

DISMARC dmFormats

RDA Carrier Types

dcterms:medium

New Europeana Vocabularies – Genre/Form

Europeana Music Genre/Form Vocabulary

Europeana Non-Music (Generic) Genre/Form Vocabulary

ebucore:hasGenre

DISMARC dmGenre

DBpedia

D1.1 Content Selection Policy broad categories

Freebase

Broad Genre/Form Concepts (Mandatory)

Europeana Music Genre/Form Vocabulary

Europeana Non-Music (Generic) Genre/Form Vocabulary

Broad Genre (Mandatory)

• Music • Spoken word • Radio • Environment

ebucore:hasGenre
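A record can be checked against the closed lists in this deck (the edm:type values listed earlier and the four mandatory broad genres) before ingestion. This is a sketch under loud assumptions: the dictionary layout is hypothetical, and genres are compared as plain labels, whereas real records would more likely carry vocabulary URIs.

```python
# Closed value lists taken from the slides
EDM_TYPES = {"TEXT", "VIDEO", "SOUND", "IMAGE", "3D"}
BROAD_GENRES = {"Music", "Spoken word", "Radio", "Environment"}


def vocabulary_problems(record):
    """record: dict with hypothetical keys 'edm:type' (str) and
    'ebucore:hasGenre' (list of genre labels)."""
    problems = []
    if record.get("edm:type") not in EDM_TYPES:
        problems.append("edm:type is not one of the allowed values")
    genres = record.get("ebucore:hasGenre") or []
    # At least one of the mandatory broad genres must be present
    if not BROAD_GENRES.intersection(genres):
        problems.append("no broad genre in ebucore:hasGenre")
    return problems
```

More specific genre terms can sit alongside the broad one; the check only requires that a broad genre is there.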

More About Controlled Vocabularies

• Europeana Sounds Manual for Data Providers section 4.5 has links to recommended vocabularies

– Genre/Form

– Subjects

– Places

– Carrier types

– Digital formats

– Medium of performance

– Names

– Roles

– Works

Thank you for listening!

Image: Friends of Music Society, Greece CC-BY-NC