frbr information exchange thomas hickey & jenny toves oclc research

Post on 03-Jan-2016

FRBR information exchange

Thomas Hickey & Jenny Toves
OCLC Research

Current FRBR information exchange

Sets of MARC-21 records
• Both bibliographic and authority
• Sometimes extended

pKeys
Unique pKeys
Lists of sets of control numbers
xISBN web service
superWork records

Some background

Our FRBRization has been done primarily at the work level

We have FRBRized
• OCLC WorldCat
  • ~60,000,000 records
  • ~1,000,000,000 holdings
  • Used in Open WorldCat, FictionFinder now
  • Will be visible in FirstSearch displays this fall
• Norwegian BIBSYS records
• Finnish national bibliography (now in WorldCat)
• Electronic thesis metadata

Processing done on a 24-node Beowulf Linux cluster

MARC 21 bibliographic data

Basic method of accepting information
Other formats get mapped into it
Fields we use:
• Author main entry
• Titles
• ISBN
• Personal name added entries
• Language

Extensions
• BIBSYS use of 490 fields to indicate hierarchy

MARC 21 Authority data

Map personal names using cross references
Map author-titles using cross references
Fields we currently use
• 008 fixed field
• 100, 130, 400

Extensions
• Files of additional cross references
  • Common title patterns
  • xISBN matching
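The cross-reference mapping described above can be sketched as a lookup table from variant headings (400) to established headings (100). This is only an illustration: the record structure (a plain dict keyed by field tag) and the function names `build_name_map` and `resolve_name` are assumptions, not the format used in the actual processing, and the sample headings are illustrative data.

```python
def build_name_map(authority_records):
    """Map every established (100) and variant (400) heading to the 100 form.

    Each record is assumed to be a dict such as
    {"100": "established heading", "400": ["variant", ...]}.
    """
    name_map = {}
    for record in authority_records:
        established = record["100"]
        name_map[established] = established
        for variant in record.get("400", []):
            # Keep the first mapping seen if two records share a variant.
            name_map.setdefault(variant, established)
    return name_map


def resolve_name(name, name_map):
    """Return the established form, or the name unchanged if it is unknown."""
    return name_map.get(name, name)


# Illustrative data only.
records = [
    {
        "100": "shakespeare, william\\1564 1616",
        "400": ["shakspere, william\\1564 1616"],
    }
]
name_map = build_name_map(records)
print(resolve_name("shakspere, william\\1564 1616", name_map))
```

Names without an authority record pass through unchanged, so records with no matching authority data still receive a key.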

pKeys

An author-title key for matching
Derived from MARC-like records & authority data

ocm00019613 shakespeare, william\1564 1616/hamlet
ocm00615676 /hamlet/shakespeare, william\1564 1616
ocm14055779 hamlet motion picture 1948
ocm00290352 /hamlet/ocm00290352
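A minimal sketch of how an author-title key of this shape might be built. The normalization rules and the names `normalize` and `make_pkey` are assumptions made for illustration; the slides do not spell out the real derivation, which also draws on authority data. Only two of the key shapes shown above are covered: author\dates/title, and /title/control-number when no author is available.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents, and turn punctuation (except commas) into spaces."""
    decomposed = unicodedata.normalize("NFKD", text)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9,]+", " ", unaccented.lower())
    return re.sub(r"\s+", " ", cleaned).strip(" ,")


def make_pkey(control_number, title, author=None, dates=None):
    """Build an author-title key; fall back to title plus control number."""
    title_part = normalize(title)
    if author:
        author_part = normalize(author)
        if dates:
            author_part += "\\" + normalize(dates)
        return author_part + "/" + title_part
    # Without an author, the control number is appended to the title.
    return "/" + title_part + "/" + control_number


print(make_pkey("ocm00019613", "Hamlet", "Shakespeare, William,", "1564-1616"))
print(make_pkey("ocm00290352", "Hamlet"))
```

Under these assumptions the two calls reproduce the first and last example keys above.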

Unique pKeys

pKeys that have been sorted and counted

692 sw00008899 milton, john\1608 1674/poems

691 sw00255854 puccini, giacomo\1858 1924/tosca

690 sw00020874 chaucer, geoffrey\d 1400/canterbury tales

688 sw00237074 melville, herman\1819 1891/moby dick

682 sw03620985 china/laws etc
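Sorting and counting keys is a small step once the pKeys exist. The sketch below assumes the keys are already in memory as a list of strings; `count_pkeys` is a hypothetical name, and the actual run was done on a cluster rather than in a single process.

```python
from collections import Counter


def count_pkeys(pkeys):
    """Return (pkey, count) pairs, most frequent first, ties broken by key."""
    counts = Counter(pkeys)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Illustrative input: three records, two of which share a key.
sample = [
    "melville, herman\\1819 1891/moby dick",
    "puccini, giacomo\\1858 1924/tosca",
    "melville, herman\\1819 1891/moby dick",
]
for key, count in count_pkeys(sample):
    print(count, key)
```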

Lists of control numbers

sw00000089 00206765 01261413 00000089 01236648 03975229 08360541 07363127

sw00000169 00000169 01647333 00420563 10957239 05205626 02325844 07299473 08244692 08555721 24509677 02533498 03967788 24728032 10130242 04849080 09477230 23323184 22051264 38870301 54266609 56760701 08366329

sw00000182 00000182 00102731

sw00000201 00000201 02786659

sw00000210 00000210 09175561

sw00000245 00000245 34103639
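The lists above are easy to read back into a mapping from work-set identifier to control numbers. This sketch assumes only what the examples show: whitespace-separated tokens, where a token starting with "sw" opens a new work-set. The name `parse_work_sets` is hypothetical.

```python
def parse_work_sets(text):
    """Parse whitespace-separated tokens into {sw-id: [control numbers]}."""
    work_sets = {}
    current = None
    for token in text.split():
        if token.startswith("sw"):
            # A token beginning with "sw" starts a new work-set.
            current = token
            work_sets.setdefault(current, [])
        elif current is not None:
            work_sets[current].append(token)
    return work_sets


sets = parse_work_sets("sw00000182 00000182 00102731 sw00000201 00000201 02786659")
print(sets["sw00000182"])
```

Because the parser works on tokens rather than lines, it gives the same result whether each work-set sits on its own line or several share one.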

xISBN web service

Takes an ISBN as input
Returns list of ISBNs in associated work
Significant processing
• Starts with control-number list of work-sets
• Uses ISBNs to pull work-sets together
• Allows fuzzy-matching on author/title
• Ends up with consistent clusters
  • In general larger than those in control-number list
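The step of using ISBNs to pull work-sets together can be sketched with a union-find structure: any two work-sets that share an ISBN end up in the same cluster. This is an assumed technique chosen for illustration, not a description of the service's actual implementation, and it leaves out the fuzzy author/title matching. The name `cluster_by_isbn` and the sample identifiers are hypothetical.

```python
def cluster_by_isbn(work_sets):
    """Merge work-sets sharing an ISBN.

    work_sets maps a work-set id to the set of ISBNs found on its records.
    Returns a list of merged ISBN sets, one per cluster.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_seen = {}  # ISBN -> first work-set id that carried it
    for work_id, isbns in work_sets.items():
        find(work_id)
        for isbn in isbns:
            if isbn in first_seen:
                union(first_seen[isbn], work_id)
            else:
                first_seen[isbn] = work_id

    clusters = {}
    for work_id, isbns in work_sets.items():
        clusters.setdefault(find(work_id), set()).update(isbns)
    return list(clusters.values())


# Hypothetical identifiers: w1 and w2 share "isbn-b", w3 stands alone.
example = {
    "w1": {"isbn-a", "isbn-b"},
    "w2": {"isbn-b", "isbn-c"},
    "w3": {"isbn-d"},
}
print(cluster_by_isbn(example))
```

Merging is transitive, which is one reason the resulting clusters can be larger than the original work-sets.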

xISBN examples

[0130188549, 0130188476]:

sw11067396 barnea, amir/agency problems and financial contracting

sw13096363 barnea, amir/agency problems on financial contracting

[000713407x, 0007126360, 0007134053, 0007134061, 0007126441]:

sw48486275 /collins new school dictionary/ocm48486275

sw49740193 /collins new school dictionary/ocm49740193

sw49740203 /collins new school dictionary/ocm49740203

xISBN XML response

<?xml version="1.0" encoding="UTF-8" ?>
<idlist>
  <isbn>000713407x</isbn>
  <isbn>0007126360</isbn>
  <isbn>0007134053</isbn>
  <isbn>0007134061</isbn>
  <isbn>0007126441</isbn>
</idlist>
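A response of this shape can be read with Python's standard library. The sketch below only parses text that has already been retrieved; the service URL and request format are not given in the slides and are not assumed here. The function name `parse_idlist` is hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_idlist(xml_text):
    """Return the ISBNs listed in an <idlist> response, in document order."""
    if isinstance(xml_text, str):
        # Parse from bytes so the encoding declaration is honoured.
        xml_text = xml_text.encode("utf-8")
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter("isbn") if el.text]


response = """<?xml version="1.0" encoding="UTF-8" ?>
<idlist>
  <isbn>000713407x</isbn>
  <isbn>0007126360</isbn>
</idlist>"""
print(parse_idlist(response))
```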

superWorks format

Developed for FictionFinder
XML format
Includes expression-level information
• All the information needed
We are adapting it to the Curioser project

superWork record layout

pKey
# manifestations, holdings, sw-id, control #s
publication dates
expressions
• expression
  • classes
  • language
  • authors
  • titles
  • subjects
  • components
    • author, title, publication data
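The layout above can be mirrored as nested data classes. This is a rough sketch: the actual superWorks format is XML, and every class and field name below is an assumption chosen to match the labels on the slide, not the real element names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    author: str = ""
    title: str = ""
    publication_data: str = ""


@dataclass
class Expression:
    classes: List[str] = field(default_factory=list)
    language: str = ""
    authors: List[str] = field(default_factory=list)
    titles: List[str] = field(default_factory=list)
    subjects: List[str] = field(default_factory=list)
    components: List[Component] = field(default_factory=list)


@dataclass
class SuperWork:
    pkey: str
    sw_id: str
    manifestation_count: int = 0
    holdings_count: int = 0
    control_numbers: List[str] = field(default_factory=list)
    publication_dates: List[str] = field(default_factory=list)
    expressions: List[Expression] = field(default_factory=list)


work = SuperWork(pkey="melville, herman\\1819 1891/moby dick", sw_id="sw00237074")
work.expressions.append(Expression(language="eng"))
print(len(work.expressions))
```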

Summary

Simpler when only work-level relationships are needed

Even for work-level relationships, a number of different formats are useful

Information needed for an interface gets much more complicated