CS 430: Information Discovery Lecture 5 Descriptive Metadata 1 Library Catalogs MARC

Upload: eustacia-osborne

Post on 11-Jan-2016

CS 430: Information Discovery

Lecture 5

Descriptive Metadata 1

Library Catalogs

MARC

Course Administration

Examples of Library Catalogs

Cornell University Library catalog:

http://catalog.library.cornell.edu/

Library of Congress, Prints and Photographs:

http://www.loc.gov/rr/print/catalog.html

Descriptive Metadata

• Catalog: metadata records that have a consistent structure, organized according to systematic rules.

• Abstract: a free text record that summarizes a longer document.

• Indexing record: less formal than a catalog record, but more structured than a simple abstract.

Some methods of information discovery search descriptive metadata about the objects.

Metadata typically consists of a catalog or indexing record, or an abstract, one record for each object.

Descriptive Metadata

• Usually stored separately from the objects that it describes, but sometimes is embedded in the objects.

• Usually the metadata is a set of text fields.

Textual metadata can be used to describe non-textual objects, e.g., software, images, music.

Descriptive metadata

Information discovery is often most effective when applied to metadata rather than to the raw information:

• Allows fielded searching

author = "Goethe"

• Suitable for non-textual material

type = "picture" and subject = "Ithaca"

• Can be used with controlled vocabulary

language = "en"
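
The fielded queries above can be sketched in a few lines of Python. This is only an illustration: the records, their titles, and the `fielded_search` helper are hypothetical, and the language list stands in for a real controlled vocabulary.

```python
from typing import Dict, List

Record = Dict[str, str]

# Hypothetical metadata records; field names follow the slide's examples.
RECORDS: List[Record] = [
    {"type": "book", "author": "Goethe", "title": "Faust", "language": "de"},
    {"type": "picture", "subject": "Ithaca", "title": "Cascadilla Gorge", "language": "en"},
    {"type": "picture", "subject": "Boston", "title": "Harbor view", "language": "en"},
]

# Assumed controlled vocabulary for the "language" field.
ALLOWED_LANGUAGES = {"en", "de", "fr"}


def fielded_search(records: List[Record], **criteria: str) -> List[Record]:
    """Return the records whose named fields equal every requested value."""
    if "language" in criteria and criteria["language"] not in ALLOWED_LANGUAGES:
        raise ValueError("language is not in the controlled vocabulary")
    return [
        rec for rec in records
        if all(rec.get(field) == value for field, value in criteria.items())
    ]


# author = "Goethe"
goethe = fielded_search(RECORDS, author="Goethe")
# type = "picture" and subject = "Ithaca"
ithaca_pictures = fielded_search(RECORDS, type="picture", subject="Ithaca")
```

Because the query names a field, a picture with no text at all can still be found through its `subject` value.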

Origins of Library Catalogs

Bibliographic Objective:

• To bring together like items

• To differentiate among similar ones

Sir Anthony Panizzi, Keeper of Printed Books (1837-56) and Principal Librarian (1856-66) at the British Museum.

His Ninety-One Rules (1841) were the basis of modern catalogue rules.

Origins of Library Catalogs

Information Discovery:

• to enable a person to find a book of which either the author, title or subject is known

• to show what the library has by a given author, on a given subject, or in a given kind of literature

• to assist in the choice of a book as to its edition (bibliographically) or to its character (literary or topical).

Charles Ammi Cutter

Librarian of the Boston Athenaeum

Rules for a Dictionary Catalog, 1874

Origins of Library Catalogs

Classification:

Division of subject matter into a hierarchy. Typically used in libraries to provide a subject-based order for shelving books.

Melvil Dewey

Acting Librarian of Amherst College (1874)

The Dewey Decimal system of book classification uses the numbers 000 to 999 to cover the general fields of knowledge and decimals to fit special subjects.
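
The hierarchy is visible in the digits themselves. As a minimal sketch, the hypothetical helper `dewey_levels` below expands a Dewey number into its main class, division, section, and full number; it assumes a plain number such as 027.7 and does not handle segmentation marks or other notation.

```python
from typing import List


def dewey_levels(number: str) -> List[str]:
    """Expand a Dewey number into main class, division, section, full number."""
    whole, _, decimals = number.partition(".")
    if len(whole) != 3 or not whole.isdigit():
        raise ValueError(f"not a Dewey number: {number!r}")
    if decimals and not decimals.isdigit():
        raise ValueError(f"not a Dewey number: {number!r}")
    levels = [whole[0] + "00", whole[:2] + "0", whole]
    if decimals:
        levels.append(f"{whole}.{decimals}")
    return levels


levels = dewey_levels("027.7")
print(levels)  # ['000', '020', '027', '027.7']
```

Each level is a prefix of the next, which is why shelving books in numeric order also groups them by subject.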

Technology

Materials to be catalogued:

• Originally books

• Extended to serials, maps, music, etc., but concepts still rely heavily on experience with books

Form of catalog:

• Entries in books (Panizzi)

• Index cards (Cutter)

• Online databases (Kilgour)

Catalogs as Investments

Costs:

• Conventional catalog records are created by skilled librarians (estimated cost: $100 per record).

• OCLC's catalog has 43 million records. Total investment is several billion dollars.
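
The two figures on this slide multiply out as follows. This is simple arithmetic on the slide's own estimates, not an independent cost model.

```python
COST_PER_RECORD_USD = 100    # the slide's estimate per record
OCLC_RECORDS = 43_000_000    # the slide's figure for OCLC's catalog

total_usd = COST_PER_RECORD_USD * OCLC_RECORDS
print(f"${total_usd / 1e9:.1f} billion")  # prints "$4.3 billion"
```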

Cataloguing Standards:

• Enable libraries to share records

• Combine records of the past with records created today

• Allow readers and librarians to move between libraries

Library Cataloguing

Anglo-American Cataloguing Rules (AACR2)

• rules for what goes into each field of a catalog record

MARC format

• an exchange format for catalog records

"MARC Catalog"

• a catalog in MARC format, where the content of each field follows AACR2

Example: Monograph catalog record

Citation

Caroline R. Arms, editor, Campus strategies for libraries and electronic information. Bedford, MA: Digital Press, 1990.

MARC fields

tag  value                                                                                     [role]
001  89-16879 r93
050  Z675.U5C16 1990
082  027.7/0973 20
245  Campus strategies for libraries and electronic information/Caroline Arms, editor.         [title statement]
260  {Bedford, Mass.} : Digital Press, c1990.                                                  [publisher]
300  xi, 404 p. : ill. ; 24 cm.                                                                [collation]
440  EDUCOM strategies series on information technology                                        [series title]
504  Includes bibliographical references (p. {373}-381).
020  ISBN 1-55558-036-X : $34.95

MARC fields (continued)

tag  value                                                  [role]
650  Academic libraries--United States--Automation.         [subject heading]
650  Libraries and electronic publishing--United States.
650  Library information networks--United States.
650  Information technology--United States.
700  Arms, Caroline R. (Caroline Ruth)
040  DLC DLC DLC
043  n-us---
955  CIP ver. br02 to SL 02-26-90
985  APIF/MIG

MARC Encoding

tag: 260

subfield a: {Bedford, Mass.} :

subfield b: Digital Press,

subfield c: c1990.

MARC encoding:

&2600#abc#{Bedford, Mass.} :#Digital Press,#c1990.%
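
The encoded string above can be reproduced with a short sketch. It follows the slide's schematic notation only, under these assumptions: `&` opens the field, the character after the three-digit tag is treated as an indicator, `#` separates the list of subfield codes and the subfield values, and `%` closes the field. Real MARC records use non-printing control characters rather than these symbols, and the function names here are hypothetical.

```python
from typing import List, Tuple

Subfield = Tuple[str, str]  # (one-letter code, value)


def encode_field(tag: str, indicator: str, subfields: List[Subfield]) -> str:
    """Encode one field in the slide's schematic notation."""
    codes = "".join(code for code, _ in subfields)
    values = "#".join(value for _, value in subfields)
    return f"&{tag}{indicator}#{codes}#{values}%"


def decode_field(encoded: str) -> Tuple[str, str, List[Subfield]]:
    """Reverse encode_field; values must not themselves contain '#'."""
    if not (encoded.startswith("&") and encoded.endswith("%")):
        raise ValueError("missing field start or end marker")
    head, codes, *values = encoded[1:-1].split("#")
    if len(codes) != len(values):
        raise ValueError("subfield codes and values do not match")
    return head[:3], head[3:], list(zip(codes, values))


SUBFIELDS = [("a", "{Bedford, Mass.} :"), ("b", "Digital Press,"), ("c", "c1990.")]
encoded = encode_field("260", "0", SUBFIELDS)
print(encoded)  # &2600#abc#{Bedford, Mass.} :#Digital Press,#c1990.%
```

Note that the punctuation between subfields (" :", ",") is stored as part of the values, as on the slide.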

Name authority files

• Caroline R. Arms or Caroline Ruth Arms?

• Which William Phillips of Cardiff?

• Mark Twain or Samuel Clemens?

• Epithets:

of Cardiff
doctor

• Dates:

1832 - 1876
flourished 1860
circa 1832 - 1876
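
One way to picture a name authority file is as a table that maps every variant form of a name to a single authorized heading. The sketch below is illustrative only: the entries, the normalization rule, and the function names are assumptions, not the contents or behavior of any real authority file.

```python
import re
from typing import Dict, List, Optional

# Illustrative entries: authorized heading -> variant forms seen in practice.
AUTHORITY_FILE: Dict[str, List[str]] = {
    "Arms, Caroline R. (Caroline Ruth)": ["Caroline R. Arms", "Caroline Ruth Arms"],
    "Twain, Mark, 1835-1910": ["Mark Twain", "Samuel Clemens"],
}


def normalize(name: str) -> str:
    """Lowercase and collapse punctuation so trivial differences do not matter."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


# Every form, including the heading itself, points to the authorized heading.
LOOKUP: Dict[str, str] = {}
for heading, variants in AUTHORITY_FILE.items():
    for form in [heading, *variants]:
        LOOKUP[normalize(form)] = heading


def authorized_heading(name: str) -> Optional[str]:
    """Return the authorized heading for a name, or None if it is unknown."""
    return LOOKUP.get(normalize(name))


print(authorized_heading("Samuel Clemens"))      # Twain, Mark, 1835-1910
print(authorized_heading("Caroline Ruth Arms"))  # Arms, Caroline R. (Caroline Ruth)
```

A lookup like this cannot decide which William Phillips of Cardiff is meant; that is what the epithets and dates in the heading are for.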

Shared cataloguing

OCLC -- Large centralized transaction processing database system

When a library catalogs a book, it deposits a MARC record in OCLC.

Other libraries can copy the record

• saves duplication of cataloguing

• builds a database of holdings

OCLC database has 43 million records
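
The workflow on this slide can be sketched as a toy model. `SharedCatalog` is a hypothetical stand-in, not OCLC's actual interface, and keying records by the control number from field 001 is an assumption made for the example.

```python
from typing import Callable, Dict, Set


class SharedCatalog:
    """Toy centralized catalog: one record per item, many holding libraries."""

    def __init__(self) -> None:
        self.records: Dict[str, dict] = {}
        self.holdings: Dict[str, Set[str]] = {}

    def catalog(self, library: str, key: str, make_record: Callable[[], dict]) -> dict:
        """Copy the deposited record if one exists; otherwise create and deposit it."""
        if key not in self.records:
            self.records[key] = make_record()  # original cataloguing happens once
        self.holdings.setdefault(key, set()).add(library)
        return dict(self.records[key])  # each library receives its own copy


original_cataloguing_runs = []


def make_record() -> dict:
    original_cataloguing_runs.append(1)
    return {"245": "Campus strategies for libraries and electronic information"}


shared = SharedCatalog()
shared.catalog("Library A", "89-16879", make_record)  # creates and deposits
shared.catalog("Library B", "89-16879", make_record)  # copies the existing record
print(len(original_cataloguing_runs))  # 1
```

The second library pays no cataloguing cost, and the holdings table records that both libraries own the book.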

Subject information

Library of Congress Subject Headings

Academic libraries--United States--Automation

Hierarchical classification

Library of Congress call number: Z675.U5C16

Dewey Decimal Classification: 027.7

Creation and maintenance of lists of subject headings and classifications is a never ending task.
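
The subject heading on this slide is a main heading followed by subdivisions separated by "--". A minimal sketch of taking such a string apart, with hypothetical function names:

```python
from typing import List, Tuple


def split_heading(heading: str) -> Tuple[str, List[str]]:
    """Split a subject heading string into its main heading and subdivisions."""
    main, *subdivisions = [part.strip() for part in heading.split("--")]
    return main, subdivisions


def broader_forms(heading: str) -> List[str]:
    """List the heading at each level of specificity, broadest first."""
    parts = [part.strip() for part in heading.split("--")]
    return ["--".join(parts[:i]) for i in range(1, len(parts) + 1)]


main, subdivisions = split_heading("Academic libraries--United States--Automation")
print(main)          # Academic libraries
print(subdivisions)  # ['United States', 'Automation']
```

Indexing each broader form lets a search for "Academic libraries--United States" also retrieve records with the more specific heading.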

Online public access catalog (OPAC)

First stage

• Library mounts its MARC records on a central computer

• Provides a simple terminal interface and dedicated terminals

• Boolean search -- fielded searching

[Most university libraries reached this stage about 1990]
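
Boolean fielded searching of the kind offered by first-stage catalogs can be sketched with an inverted index keyed by (field, word) and Python set operations for AND, OR and NOT. The records and the `term` helper are hypothetical, and the third title is invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

RECORDS: List[Dict[str, str]] = [
    {"author": "Arms", "subject": "Academic libraries", "title": "Campus strategies"},
    {"author": "Cutter", "subject": "Cataloging", "title": "Rules for a dictionary catalog"},
    {"author": "Arms", "subject": "Cataloging", "title": "An invented second title"},
]


def build_index(records: List[Dict[str, str]]) -> Dict[Tuple[str, str], Set[int]]:
    """Map each (field, lowercase word) pair to the ids of records containing it."""
    index: Dict[Tuple[str, str], Set[int]] = defaultdict(set)
    for rec_id, rec in enumerate(records):
        for field, value in rec.items():
            for word in value.lower().split():
                index[(field, word)].add(rec_id)
    return index


INDEX = build_index(RECORDS)


def term(field: str, word: str) -> Set[int]:
    """Record ids matching one fielded term (empty set if none)."""
    return set(INDEX.get((field, word.lower()), set()))


both = term("author", "Arms") & term("subject", "Cataloging")      # AND
either = term("author", "Arms") | term("author", "Cutter")         # OR
excluded = term("author", "Arms") - term("subject", "Cataloging")  # AND NOT
print(sorted(both), sorted(either), sorted(excluded))  # [2] [0, 1, 2] [0]
```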

Second stage

• Library connects computer to a campus network and Internet

• Converts card catalog records to MARC (retrospective conversion)

Library information systems

When the catalog is online ...

Add other collections and services:

• Secondary information (Inspec, Medline, Chemical Abstracts)

• Reference works (dictionaries, encyclopedias)

Improve user interface

• Add full text searching

• Add web interface

Add connections to off-campus information sources:

• Scientific journals

• Databases (census, genome)

Library management systems

A library management system, sometimes called an integrated library system, integrates the internal processes of a library, e.g., acquisitions, cataloguing, binding, circulation, etc.

It usually contains an online public access catalog, but does not provide integrated services to users.

Library management systems are produced by small companies that lack the capital and technical expertise to develop modern digital libraries.

Notes on MARC

A great achievement:

• Developed in the 1960s

• Magnetic tape exchange format for printing catalog records

• The dawn of computing:

mixed upper and lower case
variable length fields, repeated fields
non-Roman scripts

• 100(?) million records with standard content and format

• Thousands of trained librarians (millions?)

Notes on MARC

A great problem:

• Not designed for computer algorithms

• One record per item (poor links between records)

• Tied to traditional materials and traditional practices

• Not Unicode

• 100 million records at $100 each = $10 billion

A classic legacy system!