
Metadata: An Overview

Katie Dunn
Technology & Metadata Librarian
Rensselaer Polytechnic Institute

Slides, links, handout: tinyurl.com/IIST602metadata

Metadata: An overview

• What is metadata?
• Different standards for different types of resources and user groups
• Different types of metadata
• How is metadata implemented?
• What do metadata librarians do?

What is metadata?

Definitions vary!

• Information about something
  – Usually digital resources
• Structured
  – Fields, tags, etc.
  – Simple or complex

Different metadata standards for different types of resources and disciplines

• General
  – Dublin Core
• Specific
  – EAD
  – VRA Core
  – ISO 19115: Geographic information – Metadata
• (These are all descriptive metadata)

Dublin Core: a general purpose descriptive metadata scheme

Originally: 15 elements

Title
Creator
Subject
Description
Publisher
Contributor
Date
Type
Format
Identifier
Source
Language
Relation
Coverage
Rights

http://dublincore.org/documents/2012/06/14/dcmi-terms/

Dublin Core: a general purpose descriptive metadata scheme

Now: 55 terms

Title
  - Alternative Title
Creator
Subject
Description
  - Abstract
  - Table of Contents
Publisher
Contributor
Date
  - Date Available
  - Date Created
  - Date Accepted
  - Date Copyrighted
  - Date Submitted
  - Date Issued
  - Date Valid
  - Date Modified
Type
Format
  - Extent
  - Medium
Identifier
  - Bibliographic Citation
Source
Language
Relation
  - Conforms To
  - Has Format / Is Format Of
  - Has Part / Is Part Of
  - Has Version / Is Version Of
  - References / Is Referenced By
  - Replaces / Is Replaced By
  - Requires / Is Required By
Coverage
  - Spatial Coverage
  - Temporal Coverage
Rights
  - Access Rights
  - License
Rights Holder
Accrual Method
Accrual Periodicity
Accrual Policy
Audience
  - Audience Education Level
  - Mediator
Instructional Method
Provenance

http://dublincore.org/documents/2012/06/14/dcmi-terms/

Dublin Core: a general purpose descriptive metadata scheme

New terms refining Date

Date: A point or period of time associated with an event in the lifecycle of the resource.

- Date Available: Date (often a range) that the resource became or will become available.
- Date Created
- Date Accepted
- Date Copyrighted
- Date Submitted
- Date Issued
- Date Valid
- Date Modified

http://dublincore.org/documents/2012/06/14/dcmi-terms/

Dublin Core: a general purpose descriptive metadata scheme

New terms refining Format

Format: The file format, physical medium, or dimensions of the resource.

- Extent: The size or duration of the resource.

- Medium: The material or physical carrier of the resource.

http://dublincore.org/documents/2012/06/14/dcmi-terms/

Dublin Core: a general purpose descriptive metadata scheme

New terms that didn’t refine any existing terms

Rights Holder
Accrual Method
Accrual Periodicity
Accrual Policy
Audience
  - Audience Education Level
  - Mediator
Instructional Method
Provenance

http://dublincore.org/documents/2012/06/14/dcmi-terms/

Exercise: Autumn – On the Hudson River

tinyurl.com/autumn-on-the-hudson

• Create a Dublin Core description for this image using any of the 55 Dublin Core terms
  – Term definitions here: tinyurl.com/dcmi-terms
  – Elements may be repeated.
  – You don’t need to use all the elements.

Dublin Core record for Autumn – On the Hudson
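
The record shown on this slide did not survive in the transcript. As a stand-in, here is a minimal sketch of what such a description could look like, written as (term, value) pairs in Python. Only the title comes from the slides; the creator, date, medium, and subject values are assumptions about the painting, and the `group_by_term` helper is hypothetical. It illustrates the two exercise rules: elements may repeat, and not every element has to be used.

```python
from collections import defaultdict

# Hypothetical Dublin Core description. Only the title is taken from the
# slides; every other value is an assumption made for illustration.
record = [
    ("title", "Autumn – On the Hudson River"),
    ("creator", "Cropsey, Jasper Francis"),  # assumed
    ("date", "1860"),                        # assumed
    ("type", "Image"),                       # assumed
    ("medium", "Oil on canvas"),             # assumed; Medium refines Format
    ("subject", "Hudson River"),             # assumed; Subject used twice
    ("subject", "Autumn"),
]


def group_by_term(pairs):
    """Collect the values for each term, keeping repeated terms in order."""
    grouped = defaultdict(list)
    for term, value in pairs:
        grouped[term].append(value)
    return dict(grouped)


grouped = group_by_term(record)
for term, values in grouped.items():
    print(f"{term}: {'; '.join(values)}")
```

Terms such as Publisher or Rights are simply left out here, which is allowed: no Dublin Core element is mandatory.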

VRA Core description of this resource

• tinyurl.com/vra-example (http://www.vraweb.org/projects/vracore4/example026.html)

Different types of metadata

• Descriptive
• Structural
  – Gathers parts of a resource and its different types of metadata
  – METS: A metadata wrapper
• Administrative: Data management
  – Preservation (Ex. PREMIS)
  – Technical
  – Rights
  – Possibly Structural (depending on how you think about it)

Seeing Standards: A Visualization of the Metadata Universe

• http://www.lib.unc.edu/users/jlriley/metadatamap/

• Groups standards by:
  – Domain: what is described
  – Community: who uses it
  – Purpose: type of metadata (descriptive, preservation, etc.)
  – Function: what does the standard specify?

How is metadata implemented?

• Standards and schemas
• Data formats (most often XML, MARC)
• Crosswalks
• Harvesting
• Best practices
• Linked data

XML

Basic structure of XML:

    <book>
      <title>The very hungry caterpillar</title>
      <author>Eric Carle</author>
    </book>

XML

XML Attributes:

    <book>
      <title language="en">The very hungry caterpillar</title>
      <author id="http://id.loc.gov/authorities/names/n80039691">Eric Carle</author>
    </book>
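
To make the difference between element content and attributes concrete, here is a small sketch that reads the record above with Python's standard-library `xml.etree.ElementTree`. The variable names are my own, and note that attribute values must be wrapped in straight quotes for the XML to parse.

```python
import xml.etree.ElementTree as ET

# The book record from the slide, with straight quotes around attribute values.
xml_text = """
<book>
  <title language="en">The very hungry caterpillar</title>
  <author id="http://id.loc.gov/authorities/names/n80039691">Eric Carle</author>
</book>
""".strip()

book = ET.fromstring(xml_text)   # root element: <book>
title = book.find("title")       # first child element named <title>
author = book.find("author")

# Element content is read via .text, attributes via .get().
print("Title:", title.text, "| language:", title.get("language"))
print("Author:", author.text, "| id:", author.get("id"))
```

The element text carries the human-readable value, while the attributes carry extra machine-readable information: a language code on the title and an identifier URI on the author.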

XML

Transforming between MARC and XML

Linked Data: an emerging technology

“Linked Data is a method of exposing, sharing, and connecting data on the Semantic Web using URIs and RDF.” (Dublin Core User Guide)

Dublin Core User Guide: http://wiki.dublincore.org/index.php/User_Guide

Linked Data: an emerging technology

Tim Berners-Lee’s 4 principles for linked data:
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
4. Include links to other URIs so that they can discover more things.

http://www.w3.org/DesignIssues/LinkedData.html

How is linked data represented?

Each bit of description (statement) is a triple:

The (predicate) of (subject) is (object).

Example: The title of this particular book is “A Christmas Carol”.

Image: http://wiki.dublincore.org/index.php/File:Diagram1.jpeg
Image: http://wiki.dublincore.org/index.php/User_Guide
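
The triple pattern can be sketched in a few lines of Python. This is an illustration only, not a full RDF library: the book URI under example.org is hypothetical, the predicate is the Dublin Core `title` term URI, and the `to_ntriples` helper is my own and does not escape special characters in literals.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: the (predicate) of (subject) is (object)."""
    subject: str
    predicate: str
    obj: str


# Dublin Core "title" term URI.
DCTERMS_TITLE = "http://purl.org/dc/terms/title"
# Hypothetical URI standing in for "this particular book".
BOOK = "http://example.org/book/christmas-carol"


def to_ntriples(triple):
    """Write a triple whose object is a plain literal as one N-Triples-style line.

    Simplified: assumes the literal contains no quotes or backslashes.
    """
    return f'<{triple.subject}> <{triple.predicate}> "{triple.obj}" .'


statement = Triple(subject=BOOK, predicate=DCTERMS_TITLE, obj="A Christmas Carol")
line = to_ntriples(statement)
print(line)
```

Using URIs for the subject and predicate is what lets other systems link to the same book and interpret "title" the same way.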

Linked Data: What’s the point?

Image: http://wiki.dublincore.org/index.php/User_Guide

• Better interoperability between systems
• More machine-actionable data
• More powerful applications based on data

In the library world, there’s a lot of interest in linked data, but the applications and data formats are not quite there yet.

What do metadata librarians do?

• Choosing, designing, applying metadata for local projects

• Maintaining existing metadata systems (incl. traditional cataloging, in some cases)

• Working with other librarians and content experts

• Repurposing data

• Keeping existing stuff working while getting new stuff off the ground – decision-making.

Day-to-day for me

• Investigating technology for the libraries
  – Discovery service implementation
• Metadata projects
  – Rensselaer Digital Collections
  – Discovery service metadata
• Cataloging
• Link resolver (Ex Libris SFX)
• Exploring: data curation, linked data

Case study: Thesis metadata

1. Student deposits thesis
2. Grad school approves thesis
3. Library processes / edits thesis and metadata
4. Library ingests thesis into repository
5. Library converts thesis metadata for catalog

Student deposits thesis

• Rights chosen (Creative Commons or Standard)
  – Determines rights statement, who can access
• Initial metadata entry

Grad school approves thesis

• Cleared to graduate?
• Documents correct?

Library processes / edits thesis and metadata

• File processing
• Apply correct IP restrictions for access rights
• Edit metadata to standards

Faculty names

• Regularized – kind of like authority control
  – Use same name each time
• Maintain with export of existing thesis data, XSLT
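
The idea of regularizing faculty names can be sketched as a lookup from variant forms to one preferred form. The slides describe doing this with an export of existing thesis data and XSLT; the Python version below swaps in a plain dictionary for illustration, and the names and helper functions are hypothetical.

```python
# Hypothetical authority list: preferred form -> variants seen in deposits.
AUTHORIZED = {
    "Smith, Jane A.": ["Jane Smith", "J. A. Smith", "Smith, J."],
}


def normalize(name):
    """Lowercase, drop periods and commas, and collapse whitespace."""
    cleaned = name.lower().replace(".", " ").replace(",", " ")
    return " ".join(cleaned.split())


def build_lookup(authorized):
    """Map every normalized variant (and the preferred form) to the preferred form."""
    lookup = {}
    for preferred, variants in authorized.items():
        for name in [preferred, *variants]:
            lookup[normalize(name)] = preferred
    return lookup


def regularize(name, lookup):
    """Return the preferred form if the name is known, else leave it unchanged."""
    return lookup.get(normalize(name), name)


lookup = build_lookup(AUTHORIZED)
print(regularize("J.A. Smith", lookup))
print(regularize("Unknown, Person", lookup))
```

Unknown names pass through unchanged, so a person can review them and add them to the list, which is the "kind of like authority control" part.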

Library ingests thesis into repository

• Thesis appears in live repository soon after

Library converts thesis metadata for catalog

• OAI-PMH harvest
• Conversion from Dublin Core to MARC – MarcEdit and XSLT
• Load records into catalog
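
A crosswalk is essentially a mapping table from one scheme's elements to another's. The sketch below shows the idea in Python rather than XSLT. The mapping is a small subset loosely based on the Library of Congress Dublin Core to MARC crosswalk, not the actual RPI stylesheet, and the thesis record and `crosswalk` function are hypothetical.

```python
# Assumed mapping: Dublin Core element -> (MARC tag, subfield code).
# A small illustrative subset, not the workflow's real XSLT.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "date": ("260", "c"),
}


def crosswalk(dc_record):
    """Convert (element, value) pairs to (tag, subfield, value) tuples.

    Elements without a mapping are skipped; output is sorted by MARC tag.
    """
    fields = []
    for element, value in dc_record:
        target = DC_TO_MARC.get(element)
        if target is None:
            continue  # no mapping defined for this element
        tag, code = target
        fields.append((tag, code, value))
    return sorted(fields, key=lambda field: field[0])


# Hypothetical thesis record.
thesis = [
    ("creator", "Doe, Jane"),
    ("title", "A hypothetical thesis"),
    ("date", "2013"),
    ("rights", "Standard"),  # unmapped in this sketch, so it is dropped
]

marc_fields = crosswalk(thesis)
for tag, code, value in marc_fields:
    print(f"{tag} ${code} {value}")
```

Real crosswalks also have to decide what to do with elements that have no clean equivalent, which this sketch handles by simply dropping them.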

Library converts thesis metadata for catalog

• OAI-PMH harvest

http://digitool.rpi.edu:8881/OAI-script?verb=ListRecords&set=GEN01:ETD01&metadataPrefix=oai_dc&from=2013-05-01

Request parameters:
  – verb=ListRecords
  – set=GEN01:ETD01
  – metadataPrefix=oai_dc
  – from=2013-05-01
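
The harvest request is just a base URL plus four query parameters. As a sketch, the snippet below rebuilds the URL from the slide with Python's standard-library `urllib.parse` and then splits it apart again. It does not contact the server; the function name `list_records_url` is my own.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base URL of the repository's OAI-PMH endpoint, as shown on the slide.
BASE_URL = "http://digitool.rpi.edu:8881/OAI-script"


def list_records_url(base, set_spec, metadata_prefix, from_date):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {
        "verb": "ListRecords",              # the OAI-PMH operation
        "set": set_spec,                    # which collection to harvest
        "metadataPrefix": metadata_prefix,  # oai_dc = simple Dublin Core
        "from": from_date,                  # only records changed since this date
    }
    # safe=":" keeps the colon in the set name readable instead of %3A.
    return f"{base}?{urlencode(params, safe=':')}"


url = list_records_url(BASE_URL, "GEN01:ETD01", "oai_dc", "2013-05-01")
print(url)

# Going the other way: split an existing URL into its parameters.
params = parse_qs(urlsplit(url).query)
for name, values in params.items():
    print(f"{name} = {values[0]}")
```

Changing only the `from` date is enough to harvest just the theses added since the last catalog load.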

Metadata for scanned theses: generated in reverse (MARC to DC)

1. Theses scanned
2. Catalog records (MARC) exported
3. Catalog records converted to DC
4. Theses, DC metadata ingested into repository
5. Library edits catalog data to show e-copy

How is metadata implemented? (RPI thesis workflow)

• Standards and schemas
  – Dublin Core, MARC, AACR2, ETD-MS (sort of)
• Data formats (most often XML)
  – MARC, XML
• Crosswalks
  – DC -> MARCXML, MARCXML (standard) -> MARC
• Harvesting
  – OAI-PMH
• Best practices
  – ETD-MS data elements (sort of)

Questions?

Katie Dunn
Technology & Metadata Librarian
Rensselaer Polytechnic Institute
[email protected]

This presentation, handout, and links available at: tinyurl.com/IIST602metadata

Help me improve my teaching! Answer 3 quick questions: tinyurl.com/metadata2013