WorldCat Growth & Quality: Vision and Practice
Ted Fons, Director
WorldCat Global Metadata Network
Asia Pacific Regional Council 2010
April 15, 2010
OCLC
The world’s libraries. Connected.
More collaboration
More institutions
More Web-scale
More synchronization
More innovation
Local
Group
Global
More
Better
Union Catalogue – Pivotal Role
Blogs
Repositories, various sites
WorldCat Growth – Growing WorldCat Faster
Create system-wide efficiencies in library management
WorldCat Growth since 1998
Millions of records
Year   Records (millions)
1998   39
1999   41
2000   44
2001   47
2002   50
2003   52
2004   55
2005   61
2006   67
2007   86
2008   108
2009   139
2010   170
WorldCat Growth = Batch Services
[Chart: Sources of WorldCat Growth FY09. Categories: Batch Services, Online Cataloging, Quality Control. Series: New Records Created, Records Enriched. Axis: 0 to 50,000,000. Bar values not recoverable.]
WorldCat Growth – Is It Working?
Local
Group
Global
WorldCat Today
170 million records
1.5+ billion holdings
1 January 2010
Files loaded or pending for WorldCat
ABES (France)
Bavarian State Library
Bibliotheca Alexandrina (Egypt)
Bibliotheksservice-Zentrum Baden-Württemberg (Germany)
British Library
DANBIB (Denmark)
GBV (Germany)
HeBIS (Germany)
IDS Informationsverbund Deutsch-Schweiz (Switzerland)
Lebanese American University
LIBRIS (Sweden)
Qatar University
UnityUK
Zayed University Consortium (UAE)
National files loaded or pending for WorldCat
Bibliothèque nationale de France
German National Library
Libraries Australia
National Central Library, Taiwan
National Library Board, Singapore
National Library of Barbados
National Library of China
National Library of Finland
National Library of Israel
National Library of Mexico
National Library of New Zealand
National Library of Scotland
National Library of Spain
National Library of Sweden
National Library of Wales
Swiss National Library
Percentage of Non-English Records
1998: 36%
2009: 53.8%

Total Records by Language (millions)
Language      1998    2009
English       23.9    64.3
French        2.3     8.5
German        2.2     17.9
Spanish       1.6     4.5
Japanese      0.8     2.8
Russian       0.8     2.3
Chinese       0.7     4.3
Italian       0.7     2.1
Latin         0.3     1.9
Portuguese    0.3     1.1
Dutch         0.2     2.9
Hebrew        0.2     1.2
All records   37.5    117.2
Multilingual WorldCat
1.9 billion items and growing!
170 million bib records
3.6 million digital items
1.5 billion holdings
325 million electronic database records
NEW! JSTOR Metadata: 4.5 million records
30 million items (Google, HathiTrust, OAIster)
Physical holdings in WorldCat
Licensed digital content in library collections
Local library content being digitized
The collective collection
OAIster
WorldCat Growth – Synchronization with Libraries, Repositories & Metadata Hubs
Local
Group
Global
Growing WorldCat Faster
New Data Ingest Platform under Services Oriented Architecture
Partner Data
Automatic Evaluation
Automatic Manipulation
Automatic Processing
Not Good Data
Good Data
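The ingest flow named on this slide (partner data passing through automatic evaluation, manipulation, and processing, and ending up as good or not-good data) might be sketched roughly as follows. This is an illustration only, not OCLC's platform: the record shape, the evaluation rule (a record needs an identifier and a title), and the whitespace clean-up step are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class IngestResult:
    """Outcome of one batch: records accepted and records set aside."""
    good: list = field(default_factory=list)
    not_good: list = field(default_factory=list)


def evaluate(record: dict) -> bool:
    """Automatic evaluation (assumed rule): require an id and a title."""
    has_id = bool(str(record.get("id") or "").strip())
    has_title = bool(str(record.get("title") or "").strip())
    return has_id and has_title


def manipulate(record: dict) -> dict:
    """Automatic manipulation (assumed): trim whitespace from string values."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


def process_batch(partner_records: list) -> IngestResult:
    """Automatic processing: evaluate each record, clean the good ones, route the rest."""
    result = IngestResult()
    for raw in partner_records:
        if evaluate(raw):
            result.good.append(manipulate(raw))
        else:
            result.not_good.append(raw)
    return result


batch = [
    {"id": " rec-1 ", "title": " Example title "},
    {"id": "rec-2", "title": ""},  # fails evaluation: no title
]
outcome = process_batch(batch)
print(len(outcome.good), "good;", len(outcome.not_good), "not good")
```

Keeping the rejected records in their raw form, rather than discarding them, leaves room for a later manual review step.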
Using Publisher Data to Grow WorldCat
• Establish partnerships with publishers
• Ingest publisher and vendor metadata in ONIX
• Enhance publisher metadata
• Enrich WorldCat with publisher metadata
• Output enhanced ONIX data to publishers/other partners
http://www.oclc.org/partnerships/material/nexgen/nextgencataloging.htm
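As a rough illustration of the ingest and enhance steps listed on this slide, the sketch below reads a product from a small ONIX-style XML message and attaches subject headings from a catalog lookup. The sample record is invented, the tags follow ONIX 2.1-style reference names, and the `catalog_subjects` lookup is a hypothetical stand-in for whatever matching a real service performs. It stops short of writing enhanced ONIX back out.

```python
import xml.etree.ElementTree as ET

# Invented sample in ONIX 2.1-style reference tags; not real publisher data.
SAMPLE_ONIX = """\
<ONIXMessage>
  <Product>
    <RecordReference>example-0001</RecordReference>
    <ProductIdentifier>
      <ProductIDType>15</ProductIDType>
      <IDValue>9780000000002</IDValue>
    </ProductIdentifier>
    <Title>
      <TitleType>01</TitleType>
      <TitleText>An Example Title</TitleText>
    </Title>
  </Product>
</ONIXMessage>
"""


def read_products(onix_xml: str) -> list:
    """Extract the ISBN-13 (ProductIDType 15) and title from each Product."""
    root = ET.fromstring(onix_xml)
    products = []
    for product in root.iter("Product"):
        isbn = None
        for ident in product.findall("ProductIdentifier"):
            if ident.findtext("ProductIDType") == "15":
                isbn = ident.findtext("IDValue")
        products.append({
            "isbn13": isbn,
            "title": product.findtext("Title/TitleText"),
            "subjects": [],
        })
    return products


def enrich(product: dict, catalog_subjects: dict) -> dict:
    """Add subject headings from a hypothetical catalog lookup keyed by ISBN."""
    enriched = dict(product)
    enriched["subjects"] = list(catalog_subjects.get(product["isbn13"], []))
    return enriched


# Hypothetical lookup table standing in for library catalog data.
catalog_subjects = {"9780000000002": ["Example subject heading"]}
for item in read_products(SAMPLE_ONIX):
    print(enrich(item, catalog_subjects))
```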
Metadata Services for Publishers
Publisher
Central Library
District Library
Tech School Library
Book Seller
Bib Data
Enriched Bib Data
Enriched Bib Data
Real Time Update & Record Enrichment
Union Catalogue
Central Library
District Library
Tech School Library
Territory Library
Design School Library
Commenced   Records    Holdings     Merge %   Changes
02.2008     573,854    2,850,000    20-30%    c. 30,000/mth
02.2009     159,741    1,130,000    50-60%    Not yet
SRU
SRU
Union Catalog
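The diagram labels the connections to the union catalogue with SRU (Search/Retrieve via URL). A minimal sketch of what an SRU searchRetrieve request looks like is given below. The endpoint URL is a placeholder, and the `marcxml` schema name and the `dc.title` index are assumptions about what a given server supports, not details taken from the slide.

```python
from urllib.parse import urlencode


def sru_search_url(base_url: str, cql_query: str,
                   max_records: int = 10, record_schema: str = "marcxml") -> str:
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
        "recordSchema": record_schema,
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint; a real union catalogue publishes its own SRU base URL.
url = sru_search_url("https://catalogue.example.org/sru",
                     'dc.title = "union catalogue"')
print(url)
```

Because the whole request is carried in the URL, a local system can poll the union catalogue for new or changed records with nothing more than an HTTP GET.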
WorldCat Growth – Syndication
Local
Group
Global
OCLC and Google to exchange data, link digitized books to WorldCat
Synchronizes WorldCat with digital collections of interest to the membership
Participating organizations provide OCLC with a regular feed of metadata
WorldCat is automatically updated with new MARC records as materials become available
Reciprocal linking between WorldCat and the host site
Automatic
OCLC Syndicates WorldCat Data with Google Books
WorldCat Quality – Improving the Quality of the Database
Local
Group
Global
Quality Control Activities
Activity                      FY08         FY09
Bib Records Replaced          2,105,325    6,804,903
Manual Merges                 207,742      137,832
Authority Records Replaced    395,817      1,112,815
Change Requests Received      182,348      134,902
Automated Enrichment of Master Records
[Bar chart. Categories: LC-type call number, Dewey-type call number, Other call number, Contents/Summaries, Subject terms, URLs. Series: 2009 July Actual, 2009 Oct Actual, 2010 Jan Actual, 2010 Apr Actual, 2010 July Target. Axis: 0 to 120,000,000. Bar values not recoverable.]
Reducing Duplicates – An Improved Algorithm
First production run, May 18, 2009
Running small files (500 – 3000)
Statistics for May & June 2009
• 33,023 records processed
• 1,777 duplicates removed (5.7%)
• 846 records deferred for manual review
Reducing Duplicates – An Improved Algorithm
Full production run, Feb. 2, 2010
Entire Database, beginning with OCN #1
Statistics so far:
• 7.5 Million records processed
• Almost 650,000 records removed
Unique fields from deletes merged into the master record
The exception to this is non-Latin fields. We try to ensure that all non-Latin fields are in the retained record even if they are not on the list of mergeable fields.
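The merge rule described here (unique fields from the deleted record move into the retained record if they are on a mergeable list, with non-Latin fields kept regardless) can be sketched as below. This is not OCLC's algorithm: the field representation as (tag, value) pairs, the contents of `MERGEABLE_TAGS`, and the character-name test for non-Latin text are all assumptions chosen to keep the example short.

```python
import unicodedata

# Assumed list of mergeable tags; the real list is not given in the slides.
MERGEABLE_TAGS = {"050", "082", "505", "520", "650", "856"}


def has_non_latin(text: str) -> bool:
    """Rough check (assumption): any letter whose Unicode name is not Latin."""
    return any(
        ch.isalpha() and "LATIN" not in unicodedata.name(ch, "")
        for ch in text
    )


def merge_duplicate(retained: list, duplicate: list) -> list:
    """Return the retained record plus unique fields carried over from the duplicate."""
    merged = list(retained)
    seen = set(retained)
    for tag, value in duplicate:
        if (tag, value) in seen:
            continue  # not unique: the retained record already has it
        if tag in MERGEABLE_TAGS or has_non_latin(value):
            merged.append((tag, value))
            seen.add((tag, value))
    return merged


retained = [("245", "Tokyo"), ("650", "Cities")]
duplicate = [
    ("245", "Tokyo"),                # already present
    ("650", "Japan"),                # unique and mergeable
    ("880", "東京"),                  # non-Latin: kept even though 880 is not listed
    ("500", "Duplicate-only note"),  # unique but not mergeable: dropped
]
for field in merge_duplicate(retained, duplicate):
    print(field)
```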
Expert Community Experiment
Experiment to test “social cataloging” with OCLC’s expert community (modeled on Wikipedia)
Interest and motivation from WorldCat Local libraries that want to use WorldCat Local as their “database of record”
Total Replaces = 108,766
Month               Replaces
February (15-28)    5,816
March               18,010
April               19,489
May                 16,704
June                19,387
July                20,287
August (1-15)       9,073
Expert Community
What are they saying?
“I am loving the ability to fix typos, add more subject headings, etc. Some of which were things I would do locally but were too much of a hassle to fix at the oclc level.”
“Thank you so much for the opportunity to participate in the community enhancement experiment! Having the ability to correct typos, flesh out minimal…cataloging…is really wonderful. I hope the experiment works out well … -- I would love to see it made a permanent feature”
WorldCat is much more than a warehouse of records
Continuous improvement of WorldCat records by members:
• Enhance
• Record enrichments
• Expert Community Experiment
• Error reporting
OCLC’s quality management role:
• WorldCat Quality group
• Automated record enrichment
• FRBRization
• Duplicate detection and resolution
• Support for Program for Cooperative Cataloging – NACO, CONSER, BIBCO, etc.
• Ongoing conformance to library standards
A partnership of members and OCLC
WorldCat Quality – Let’s Probe What “Quality” Really Means
Local
Group
Global
Online Catalogs: What Users and Librarians Want
End users expect online catalogs:
• to look like popular Web sites
• to have summaries, abstracts, tables of contents
• to help find needed information
Librarians expect online catalogs:
• to serve end users’ information needs
• to help staff carry out work responsibilities
• to have accurate, structured data
• to exhibit classical principles of organization
http://www.oclc.org/us/en/reports/onlinecatalogs/default.htm
Recommended enhancements to WorldCat: Total end-user responses
[Chart: End-User Results: Recommended Enhancements]
[Chart: Librarian/Staff Results: Highlighted Differences]
What did we learn?
End-user focus group results
Key observations:
• Delivery is as important as discovery, if not more important.
• Seamless, easy flow from discovery through delivery is critical.
• Summaries and tables of contents are key elements of a description.
• Improved search relevance is necessary.
WorldCat Registry – Enabling Services
Local
Group
Global
Metadata about Libraries
WorldCat Registry
• A repository of metadata about libraries:
• Location
• Contacts
• Policies
• Links
WorldCat Registry Value Proposition
The WorldCat Registry allows your library to:
• Provide direct linking to local library services over a variety of OCLC products including WorldCat.org and WorldCat Local
• Create and manage a profile that centralizes and automates information sharing with vendors and OCLC
• Receive the free benefit of greater internet visibility, regardless of OCLC membership
worldcat.org/registry/institutions
Registry Growth 2007-2009
2007
• 70,000 records
• some library users
• 20,000 requests/mo via OpenURL Gateway
2009
• 130,000 records
• Over 4,500 library users managing records
• Processing 200,000-300,000 requests/mo via OpenURL Gateway
• Multiple OCLC and non-OCLC services that rely on this data
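To picture what a gateway backed by registry data does with each request, the sketch below looks up an institution's link resolver and forwards a citation to it as an OpenURL. It is a simplified illustration rather than the WorldCat Registry API: the `REGISTRY` table, the institution ID, and the resolver address are invented, and only the OpenURL key/value names (`url_ver`, `rft_val_fmt`, `rft.*`) come from the Z39.88-2004 format.

```python
from urllib.parse import urlencode

# Hypothetical registry entries: institution ID -> link resolver base URL.
REGISTRY = {
    "inst-001": "https://resolver.example.edu/openurl",
}


def route_openurl(institution_id: str, citation: dict, registry: dict = REGISTRY):
    """Return the institution's resolver URL for a citation, or None if unregistered."""
    base_url = registry.get(institution_id)
    if base_url is None:
        return None
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    # Citation keys such as issn, volume, spage become rft.* referent keys.
    params.update({f"rft.{key}": value for key, value in citation.items()})
    return f"{base_url}?{urlencode(params)}"


citation = {"issn": "1234-5679", "volume": "12", "spage": "34", "date": "2009"}
print(route_openurl("inst-001", citation))
print(route_openurl("inst-999", citation))  # unregistered institution
```

The service making the request needs only the institution ID; the resolver address is maintained once, in the library's registry profile.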
Bringing It All Together: RedLaser App
http://redlaser.com
The WorldCat knowledgebase
The Vision
Achieve web scale for KB services
Move the KB to the cloud
Provide KB services through an API model to:
• Provide a central platform for KB data management
• Allow read and write access to the KB within OCLC services
• Allow read and write access to the KB for external services
The KB can be managed in one place, but exposed anywhere
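The idea of one knowledge base managed centrally and exposed to many services through an API can be sketched in a few lines. Everything here is hypothetical: the class and method names, the in-memory storage, and the two example services are illustrations of the model, not the design of any OCLC API.

```python
class KnowledgeBase:
    """Single shared store of holdings links, keyed by library and then ISSN."""

    def __init__(self):
        self._holdings = {}

    def write_holding(self, library: str, issn: str, link: str) -> None:
        """Write access: record that a library holds a title at a given link."""
        self._holdings.setdefault(library, {})[issn] = link

    def read_link(self, library: str, issn: str):
        """Read access: the link for one title, or None if not held."""
        return self._holdings.get(library, {}).get(issn)

    def read_titles(self, library: str) -> list:
        """Read access: all ISSNs held by a library, sorted."""
        return sorted(self._holdings.get(library, {}))


class LinkResolver:
    """Example service that reads from the shared KB."""

    def __init__(self, kb: KnowledgeBase):
        self.kb = kb

    def resolve(self, library: str, issn: str):
        return self.kb.read_link(library, issn)


class AtoZList:
    """Second example service reading the same KB; it keeps no copy of its own."""

    def __init__(self, kb: KnowledgeBase):
        self.kb = kb

    def titles(self, library: str) -> list:
        return self.kb.read_titles(library)


kb = KnowledgeBase()
resolver, a_to_z = LinkResolver(kb), AtoZList(kb)

# One write to the central KB is visible to every service at once.
kb.write_holding("lib-a", "1234-5679", "https://journals.example.org/1234-5679")
print(resolver.resolve("lib-a", "1234-5679"))
print(a_to_z.titles("lib-a"))
```

Because both services hold a reference to the same object rather than their own copy, there is only one place where holdings need to be corrected.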
Web scale value proposition
70% INFRASTRUCTURE
30% INITIATIVE
Amazon.com: http://www.slideshare.net/goodfriday/amazon-web-services-building-a-webscale-computing-architecture
Cloud Computing
A style of computing in which scalable and elastic IT-enabled capabilities are delivered as a service to external customers using Internet technologies.
-Gartner Group
Simple: Web-based applications with shared data and services.
Traditional KB Services
The traditional model for KB services is to build a KB to support a service or product
[Diagram labels: Link Resolver, ERM, A-Z, Metasearch; a KB shown with each service]
Powering the library
More power
[Diagram labels: A-Z, ERM, Link Resolver, Metasearch; one KB]
Efficient storage of data in the cloud: Common use data
[Diagram labels: Common Use Data (Bib, Holdings, User Data); Library, Users, Suppliers, Partners]
Efficient storage of data in the cloud: Common use data
[Diagram labels: Common Use KB Data (Titles, Holdings, Collections, User Data); Library, Users, Suppliers, Partners]
Traditional Link Resolver Model
[Diagram labels: Citation, A&I Database, Link Resolver, KB, Science Direct, Ebsco, Gale]
WorldCat Knowledgebase model
[Diagram series. Labels: WorldCat, KBWC (Collections, Titles, Holdings, Links), KBWC API, WorldCat Link Manager, A&I Database, Collection Builder, WMS License Manager, WMS ERM, WC Resource Sharing, WorldCat.org, Touchpoint, WC Local, Third Party, Citation, Lenders, Links, Rights, Filters]
WorldCat Growth – The Value of the Cooperative
Local
Group
Global
The Value of the Shared WorldCat Network
• Record supply: An incomparable source of library-standard records to support local or group library discovery and collection management.
• Registration of holdings: Bibliographic and holdings data from more than 70,000 libraries, underpinning delivery of library collections, resource sharing, and collection analysis.
• Knowledge organization: An infrastructure utilizing library standards for description, name authority control, classification, and terminologies.
Record Supply: Where do WorldCat records come from?
The cooperative provides the content.
The cooperative activity provides the value.
Cataloging: Key to All OCLC Services
WorldCat
Find & Get Items
Discovery
Holdings & Availability
Resource Sharing
Bibliographic Descriptions
Cataloging & Metadata Services
Library Descriptions
WorldCat Registry