Australian Newspapers service: progress and search and delivery system, Nov 2008
TRANSCRIPT
1
Australian Newspapers Digitisation Program
Overview of Progress
March 2007 – November 2008
and the public search and delivery system
Rose Holley, Manager ANDP
5 November 2008, National Library of Australia
Presentation to the National Library of Indonesia Delegation.
2
Increase access to Australian newspapers
Build a national service that will provide free online access from the first Australian newspaper published in 1803 through to the end of 1954
Key Features of the service: online access, freely available, full text searchable
Objectives
3
Website: http://www.nla.gov.au/ndp
4
National Content
Initial focus on major titles from each state and territory
Anticipate that ‘regional’ titles may be contributed later
Coverage: published between 1803 and 1954 (out of copyright)
West Australian
Northern Territory Times
Courier Mail
Advertiser
Sydney Morning Herald
Sydney Gazette
Argus
Mercury
Canberra Times
5
Coverage 1803 ‐ 1954
6
State Newspaper Titles
7
$1 Million Grant from the Vincent Fairfax Family Foundation to digitise The Sydney Morning Herald to 1954
8
1.5 million newspaper pages digitised from microfilm
Pilot phase completed: Optical Character Recognition (OCR) and content analysis of 50,000 pages
Prototype search and delivery service developed and tested with Australian state libraries
Beta search and delivery service available with 360,000 pages (3.5 million articles)
Progress November 2008…
9
Microfilm converted to digital images
The Process
10
Check images on reels
11
Quality Assurance on each page
12
Page sequence
Metadata creation
Missing page targets
13
Optical Character Recognition (OCR) of pages and article zoning
14
OCR In India
15
Accessing the Newspapers
Beta service now available
Contains 360,000 pages
Open for public use and feedback
Screenshots follow
16
Home page
17
Search results
18
Page view
19
Article View
20
Correct OCR
21
22
Add tags to articles
23
Tag cloud
24
Add note/comment to article
25
Title information and browse
26
Title list
27
Feedback
28
Website: http://www.nla.gov.au/ndp