Automated creation of analytic catalog records for born-digital journal articles

Kevin S. HAWKINS @KevinSHawkins

Upload: nasig

Post on 11-Nov-2014

DESCRIPTION

This presentation will summarize the approach to bibliographic metadata developed at the University of Michigan Library for journal articles published and archived in HathiTrust using the mPach toolset, which allows journal editors to create born-digital open-access journals and create their own metadata as a byproduct of the publishing process. Specifically, mPach allows a journal editor to convert edited manuscripts from common source formats such as Microsoft Word into JATS (Z39.96-2012) XML and embed structured metadata about the article and journal. Since HathiTrust currently uses MARC as its common-denominator metadata format, JATS metadata are automatically mapped to MARC fields, creating one analytic record per article but without normalizing to follow RDA rules for transcription from primary sources of information or creating entries according to name authorities. For each new journal, a serial record for the journal is created manually by a serials cataloger. This serial record and each analytic record for articles in that journal link to a "collection" for the journal built using the HathiTrust Collections feature.

See accompanying handout at http://www.slideshare.net/NASIG/automated-creation-of-analytic-catalog-records-for-born-digital-journal-articleshandout

Presenter: Kevin S. Hawkins, Director of Library Publishing, University of North Texas, Denton, Texas

TRANSCRIPT

Page 1: Automated creation of analytic catalog records for born digital journal articles

Automated creation of analytic catalog records for born-digital journal articles

Kevin S. HAWKINS @KevinSHawkins

Page 2: Automated creation of analytic catalog records for born digital journal articles

Digital Library Production Service

Page 3: Automated creation of analytic catalog records for born digital journal articles

Page 4: Automated creation of analytic catalog records for born digital journal articles

Page 5: Automated creation of analytic catalog records for born digital journal articles

Page 6: Automated creation of analytic catalog records for born digital journal articles

Page 7: Automated creation of analytic catalog records for born digital journal articles

[Diagram; surviving labels: "inspired", "formed the basis of"]

Page 8: Automated creation of analytic catalog records for born digital journal articles

Page 9: Automated creation of analytic catalog records for born digital journal articles

Page 10: Automated creation of analytic catalog records for born digital journal articles

Page 11: Automated creation of analytic catalog records for born digital journal articles

Opportunities

• HathiTrust
  – offers a better infrastructure for development than DLXS
  – is certified by Trustworthy Repositories Audit & Certification (TRAC)
• There’s growing interest among institutions in building a shared infrastructure for publishing.

Page 12: Automated creation of analytic catalog records for born digital journal articles

mPach: what are we creating?

• modular platform
• tightly coupled with the HathiTrust repository
• for open-access journals
• all you need to publish and preserve an OA journal
• will integrate with Open Journal Systems (OJS)

Page 13: Automated creation of analytic catalog records for born digital journal articles

Page 14: Automated creation of analytic catalog records for born digital journal articles

mPach Prepper (1 of 8)

Page 15: Automated creation of analytic catalog records for born digital journal articles

mPach Prepper (2 of 8)

Page 16: Automated creation of analytic catalog records for born digital journal articles

mPach Prepper (3 of 8)

Page 17: Automated creation of analytic catalog records for born digital journal articles

mPach Prepper (4 of 8)

Page 18: Automated creation of analytic catalog records for born digital journal articles

mPach Prepper (5 of 8)

Page 19: Automated creation of analytic catalog records for born digital journal articles

mPach Prepper (6 of 8)

Page 20: Automated creation of analytic catalog records for born digital journal articles

mPach Prepper (7 of 8)

Page 21: Automated creation of analytic catalog records for born digital journal articles

mPach Prepper (8 of 8)

Page 22: Automated creation of analytic catalog records for born digital journal articles
Page 23: Automated creation of analytic catalog records for born digital journal articles
Page 24: Automated creation of analytic catalog records for born digital journal articles

Questions so far?

Page 25: Automated creation of analytic catalog records for born digital journal articles
Page 26: Automated creation of analytic catalog records for born digital journal articles
Page 27: Automated creation of analytic catalog records for born digital journal articles

HathiTrust’s Bibliographic Metadata Specifications

When a HathiTrust partner institution provides a digital object for inclusion in HathiTrust, it must provide a catalog record in MARCXML format using fields as defined in the Bibliographic Metadata Specifications, an extension of MARC 21 minimal-level requirements.

Page 28: Automated creation of analytic catalog records for born digital journal articles

What is the repository unit (barcode equivalent) for born-digital journals?

an individual article

Page 29: Automated creation of analytic catalog records for born digital journal articles

But …

There is also metadata that relates to the journal as a whole, such as:
• title of the journal
• name of the publisher
• place of publication

What to do with these?

Page 30: Automated creation of analytic catalog records for born digital journal articles

mPach’s solution: creating two kinds of records conforming to HathiTrust’s Bibliographic Metadata Specifications

• Serial record for the journal: created manually
• Analytic record for each article: created automatically by mPach’s Prepper

Page 31: Automated creation of analytic catalog records for born digital journal articles

Workflow for manual creation of serial records (1/2)

When a new journal comes along that will use mPach:

1. Journal editor fills out a form that asks for:
   – journal title
   – any alternative titles or abbreviations
   – any previous titles
   – any ISSNs related to the journal
   – a short description of the scope of the journal

Page 32: Automated creation of analytic catalog records for born digital journal articles

Workflow for manual creation of serial records (2/2)

2. A serials cataloger will check to see if the HathiTrust catalog already contains a record for the journal (or for any previous titles). Any existing records will be modified, a new record will be created, or both, with each record linking to the journal’s homepage.

Page 33: Automated creation of analytic catalog records for born digital journal articles

Full view v. 4 (2014) - (original from University of Michigan)

Page 34: Automated creation of analytic catalog records for born digital journal articles
Page 35: Automated creation of analytic catalog records for born digital journal articles

So can users only discover articles by way of the journal homepage?

Nope! The analytic records for each article will also be in the HathiTrust catalog, so you can find articles directly (if, for example, you search the catalog for a known article title).

Page 36: Automated creation of analytic catalog records for born digital journal articles

Automatic creation of article records (1/2)

To review, the user (e.g., the journal editor) uses mPach’s Prepper to prepare an article for ingest into HathiTrust.

A combination of paragraph styles in Microsoft Word and manually entered metadata in Prepper ensures that the bibliographic metadata is properly encoded in JATS XML.
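
To make "properly encoded in JATS XML" concrete, here is a minimal sketch of reading bibliographic metadata out of a JATS front matter. The element names (`journal-meta`, `article-meta`, `contrib`, `surname`, `given-names`) come from the JATS tag set, but the sample document, the function names, and the dictionary keys are assumptions made for illustration; this is not Prepper's code or actual mPach output.

```python
import xml.etree.ElementTree as ET

# Hypothetical JATS fragment written for this example (not mPach output).
SAMPLE_JATS = """\
<article>
  <front>
    <journal-meta>
      <journal-title-group>
        <journal-title>Journal of Examples</journal-title>
      </journal-title-group>
      <issn pub-type="epub">1234-5678</issn>
      <publisher>
        <publisher-name>Example University Library</publisher-name>
      </publisher>
    </journal-meta>
    <article-meta>
      <title-group>
        <article-title>An example article title</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name><surname>Doe</surname><given-names>Jane Q.</given-names></name>
        </contrib>
        <contrib contrib-type="author">
          <name><surname>Roe</surname><given-names>Richard</given-names></name>
        </contrib>
      </contrib-group>
    </article-meta>
  </front>
</article>
"""


def text_of(parent, path):
    """Return the stripped text content at `path`, or None if it is absent."""
    node = parent.find(path)
    if node is None:
        return None
    return "".join(node.itertext()).strip()


def read_jats_metadata(jats_xml):
    """Collect journal-level and article-level metadata from a JATS string."""
    front = ET.fromstring(jats_xml).find("front")
    journal = front.find("journal-meta")
    article = front.find("article-meta")

    authors = []
    for contrib in article.findall("contrib-group/contrib"):
        if contrib.get("contrib-type") != "author":
            continue
        # Names are kept exactly as entered, already split into two parts.
        authors.append((text_of(contrib, "name/surname"),
                        text_of(contrib, "name/given-names")))

    return {
        "journal_title": text_of(journal, "journal-title-group/journal-title"),
        "issn": text_of(journal, "issn"),
        "publisher": text_of(journal, "publisher/publisher-name"),
        "article_title": text_of(article, "title-group/article-title"),
        "authors": authors,
    }


metadata = read_jats_metadata(SAMPLE_JATS)
print(metadata["article_title"], metadata["authors"])
```

Because the title and the name parts each sit in their own element, a later mapping step can pick them up without guessing where one ends and the next begins.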

Page 37: Automated creation of analytic catalog records for born digital journal articles

Automatic creation of article records (2/2)

So because we’ll have data that is correctly structured and actually correct, we will be able to map from JATS XML to the fields required to create an analytic MARCXML record for the article.

Each analytic record will be created automatically at the time that an article is ingested.

Our crosswalk, developed with significant assistance from Steven Holloway at ATLA, was donated to the JATS community on the JATS wiki.
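
As a rough illustration of what such a mapping step can look like, the sketch below turns a small dictionary of article metadata into a MARCXML record. It is not the crosswalk donated to the JATS wiki: the choice of fields (245 for the article title, 773 for the host journal), the leader value, the indicator values, and all function and key names are assumptions for the example. Name entries are left out to keep it short.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"  # MARCXML namespace
ET.register_namespace("marc", MARC_NS)


def marc(tag):
    """Qualify an element name with the MARCXML namespace."""
    return f"{{{MARC_NS}}}{tag}"


def add_datafield(record, tag, ind1, ind2, subfields):
    """Append one datafield with the given (code, value) subfields."""
    field = ET.SubElement(record, marc("datafield"),
                          {"tag": tag, "ind1": ind1, "ind2": ind2})
    for code, value in subfields:
        ET.SubElement(field, marc("subfield"), {"code": code}).text = value
    return field


def build_analytic_record(meta):
    """Build a minimal analytic record from a metadata dict (assumed keys)."""
    record = ET.Element(marc("record"))

    # Illustrative 24-character leader: type "a" (language material),
    # bibliographic level "b" (serial component part). Lengths are zeroed.
    ET.SubElement(record, marc("leader")).text = "00000nab a2200000   4500"

    # 245: title as displayed in the article. First indicator "0" because
    # the record has no main entry; second indicator "0" is an assumption.
    add_datafield(record, "245", "0", "0", [("a", meta["article_title"])])

    # 773: host item entry pointing at the journal the article belongs to.
    host = [("t", meta["journal_title"])]
    if meta.get("issn"):
        host.append(("x", meta["issn"]))
    add_datafield(record, "773", "0", " ", host)

    return record


example = build_analytic_record({
    "article_title": "An example article title",
    "journal_title": "Journal of Examples",
    "issn": "1234-5678",
})
print(ET.tostring(example, encoding="unicode"))
```

Running the mapping at ingest time, as described above, means no one has to key an article record by hand; the trade-off is that the record can only be as good as the metadata the editor entered.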

Page 38: Automated creation of analytic catalog records for born digital journal articles

But how good are these records? Do they follow AACR2 or RDA?

Not in the following ways:

• Records will not have titles of articles transcribed according to AACR2/RDA; instead, they will be in the record as displayed in the article.

• Names will be handled as the mPach user spelled them and divided them into forenames and surnames.

• We haven’t bothered with choosing a main entry: all access points are added entries.
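
The three points above can be sketched as a small function that produces access points for one article. Everything here is hypothetical: the function names, the tuple layout, and in particular the inverted "Surname, Forenames" form and the use of MARC 700 for every name are assumptions for illustration, not a description of mPach's actual output.

```python
def name_heading(surname, forenames):
    """Join the two name parts as the user entered them.

    No authority file is consulted and spelling is not normalized;
    the inverted "Surname, Forenames" form is an assumption.
    """
    surname = surname.strip()
    forenames = (forenames or "").strip()
    return f"{surname}, {forenames}" if forenames else surname


def access_points(article_title, authors):
    """Return (tag, ind1, ind2, subfields) tuples for one article."""
    # Title exactly as displayed in the article; first indicator "0"
    # because there is no main entry (no 1XX field) in the record.
    fields = [("245", "0", "0", [("a", article_title)])]
    # Every name, including the first author, becomes an added entry (700).
    for surname, forenames in authors:
        fields.append(
            ("700", "1", " ", [("a", name_heading(surname, forenames))])
        )
    return fields


for field in access_points("An example article title",
                           [("Doe", "Jane Q."), ("Roe", "Richard")]):
    print(field)
```

Treating the first author the same as the rest avoids having software decide which name deserves main-entry status, a judgment that cataloging rules normally leave to a person.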

Page 39: Automated creation of analytic catalog records for born digital journal articles

For anyone interested, I have an annotated handout of a working document showing how the analytic and serial records will relate to each other and the other components of mPach.

Page 40: Automated creation of analytic catalog records for born digital journal articles

Questions?

http://www.lib.umich.edu/mpach