Europeana Newspapers

Europeana Newspapers, 9 June 2014 – London – Morning Edition. Published by Alastair Dunning, The European Library, @alastairdunning

Upload: alastairdunning

Post on 08-May-2015


DESCRIPTION

Building a website to search over historic European newspapers, http://www.theeuropeanlibrary.org/tel4/newspapers/

TRANSCRIPT

Page 1: Europeana Newspapers -

Europeana Newspapers, 9 June 2014 – London – Morning Edition

Published by Alastair Dunning, The European Library, @alastairdunning, www.slideshare.net/alastairdunning

Page 2: Europeana Newspapers -

On 15th April 1912, the passenger ship Titanic, carrying over 2,000 passengers and crew, sank after striking an iceberg on its maiden voyage from Southampton to New York.

Page 3: Europeana Newspapers -

Responses to the Titanic Disaster

http://anno.onb.ac.at/cgi-content/anno?aid=nzg&datum=19120417&seite=1&zoom=33

Page 4: Europeana Newspapers -

Responses to the Titanic Disaster

http://kranten.delpher.nl/nl/view/index?query=de+telegraaf+titanic&coll=ddd&image=ddd%3A110546692%3Ampeg21%3Aa0026&page=2&maxperpage=10&sortfield=date

Page 5: Europeana Newspapers -

Responses to the Titanic Disaster

http://gallica.bnf.fr/ark:/12148/bpt6k289555z

Page 6: Europeana Newspapers -

Responses to the Titanic Disaster

http://hemerotecadigital.bne.es/details.vm?q=id:0000817544&s=0

Page 7: Europeana Newspapers -

Responses to the Titanic Disaster

Page 8: Europeana Newspapers -

News travels at different speeds, with importance that diminishes at different rates.

This is as true now as it was in 1912 (though the web changes things …)

Page 9: Europeana Newspapers -

The Europeana Newspapers project is making this kind of

investigation easier

Page 10: Europeana Newspapers -

A cross-searchable newspapers interface at The European Library

(with issue-level metadata forwarded to Europeana)

http://www.theeuropeanlibrary.org/tel4/newspapers

Page 11: Europeana Newspapers -

Currently: search through around 2 million pages of full text

By 2015: 10m pages of full text, from up to 2m issues

Search by keyword, and organise results by language, date, source library or title
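As a rough illustration of keyword search with results organised by language, date, source library or title, here is a minimal Python sketch. The records, field names and functions are all hypothetical and are not the project's actual schema or code; the sketch only shows the general idea of filtering by keyword and then counting results per facet.

```python
from collections import Counter

# Hypothetical page records; names and fields are illustrative only.
PAGES = [
    {"title": "Paper A", "language": "de", "date": "1912-04-17",
     "library": "Library 1", "text": "Der Untergang der Titanic"},
    {"title": "Paper B", "language": "nl", "date": "1912-04-16",
     "library": "Library 2", "text": "De ramp van de Titanic"},
    {"title": "Paper C", "language": "fr", "date": "1912-04-17",
     "library": "Library 3", "text": "Le naufrage du Titanic"},
    {"title": "Paper C", "language": "fr", "date": "1912-05-02",
     "library": "Library 3", "text": "Nouvelles de Paris"},
]


def search(pages, keyword):
    """Return the pages whose full text contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [page for page in pages if needle in page["text"].lower()]


def facet_counts(results, field):
    """Count how many results fall under each value of one facet field."""
    return Counter(page[field] for page in results)


if __name__ == "__main__":
    hits = search(PAGES, "titanic")
    for field in ("language", "date", "library", "title"):
        print(field, dict(facet_counts(hits, field)))
```

A real interface would run this against a search index rather than a Python list, but the facet counts shown beside a result list follow the same pattern.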

Page 12: Europeana Newspapers -

Currently: search through metadata records relating to 1.12m issues, with links to source libraries

By 2015: search through metadata records relating to up to 4m issues, with links to source libraries

Browse by date or map

Page 13: Europeana Newspapers -

Full Text from following libraries

• Bibliothèque nationale de France / National Library of France
• Koninklijke Bibliotheek / National Library of the Netherlands
• Landesbibliothek Dr. Friedrich Teßmann / Teßmann Library
• Eesti Rahvusraamatukogu / Estonian National Library
• Kansalliskirjasto / National Library of Finland
• Latvijas Nacionala Biblioteka / National Library of Latvia
• Biblioteka Narodowa / National Library of Poland
• Milli Kutuphane Baskanligi / National Library of Turkey
• Österreichische Nationalbibliothek / Austrian National Library
• Staatsbibliothek zu Berlin / Berlin State Library
• Staats- und Universitätsbibliothek Hamburg / State and University Library Hamburg
• Univerzitet u Beogradu / University Library of Belgrade

Searching by title

Page 14: Europeana Newspapers -

Issue Level Records from following libraries

• National Library of Wales
• St. Cyril and Methodius National Library / The National Library of Bulgaria
• National Library of Czech Republic
• National and University Library in Zagreb
• Koninklijke Bibliotheek van België / Bibliothèque royale de Belgique
• Narodna in univerzitetna knjižnica / National and University Library of Slovenia
• National Library of Portugal
• National Library of Romania
• Landsbókasafn Íslands - Háskólabókasafn / National and University Library of Iceland
• National Library of Spain
• Bibliothèque nationale de Luxembourg / National Library of Luxembourg

Finding matching results in single or multiple issues

Page 15: Europeana Newspapers -

Highlighting search terms

Page 16: Europeana Newspapers -

So far, okay. Similar functionality to other national and regional digital

libraries of newspapers

See other archives via: https://www.google.com/maps/ms?msid=217164746645697066594.0004c3d764fcb71ed2314&msa=0

Page 17: Europeana Newspapers -

But what was the user response to an aggregation of European newspaper libraries?

Results of Usability Testing: http://www.europeana-newspapers.eu/wp-content/uploads/2014/05/The-European-Library-Newspaper-Archive-Usability-testing-Report-April-2014.pdf

Page 18: Europeana Newspapers -

“Aggregated view of content from many sources highly valued. There was a strong positive reaction to the availability of the archive.”

Page 19: Europeana Newspapers -

“Many saying they would be keen to return to the site as

the content expands.”

Page 20: Europeana Newspapers -

“Ability to search over geographic map was highly valued”

Page 21: Europeana Newspapers -

Plenty of quibbles about design

- positions of advanced options
- re-order list of results
- manipulating facets

Page 22: Europeana Newspapers -

Much greater expectations of functionality once logged in

For example:
- Saved searches
- New content notification

Page 23: Europeana Newspapers -

“Much of the value of the site to participants was provided by the images of the documents.

Participants expected to be able to save a 'local' copy once they had located content of

relevance.

As no download facility is provided, this led to some frustration and undermined the overall

potential value of the site for some participants.”

Page 24: Europeana Newspapers -

Timetable for rest of project
Now – Prototype version of interface shared with project
Throughout 2014 – Ongoing creation of OCR, and other related technical work (OLR, Named Entities)
Throughout 2014 – Live version of website improved / usability testing / added content
Autumn 2014 – Final project conference
Late 2014 – Newspaper browser completed with content and tools from project

More information at http://www.europeana-newspapers.eu/

Interface at http://www.theeuropeanlibrary.org/tel4/newspapers/

Page 25: Europeana Newspapers -

Things the users didn’t say (but I thought they would)

Page 26: Europeana Newspapers -

Why can’t I edit the text?

(Our sample was researchers; maybe it is other communities that are interested in crowdsourcing?)

Note: If time permits, The European Library will develop a crowdsourcing feature

Page 27: Europeana Newspapers -

Can I download text for data mining?

Remember: Digital Humanists are still a small percentage of humanists and users

Note: Many of the texts are marked public domain, so this is feasible in legal terms

Page 28: Europeana Newspapers -

Number of digitised pages in interface: c.2m

Number of digitised pages in European libraries: c.130m

Number of physical pages in European libraries: 1.5bn+

Source: European Newspaper Survey Report http://www.europeana-newspapers.eu/wp-content/uploads/2012/04/D4.1-Europeana-newspapers-survey-report.pdf
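To put the three figures above in proportion, here is a short Python calculation. It uses only the approximate numbers quoted on this slide; since the physical total is given as "1.5bn+", the shares of physical pages are upper bounds.

```python
# Approximate figures quoted on the slide (c.2m, c.130m, 1.5bn+).
PAGES_IN_INTERFACE = 2_000_000
DIGITISED_PAGES = 130_000_000
PHYSICAL_PAGES = 1_500_000_000  # a lower bound, so shares of it are upper bounds


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


if __name__ == "__main__":
    print(f"Interface vs digitised pages: {share(PAGES_IN_INTERFACE, DIGITISED_PAGES):.1f}%")
    print(f"Digitised vs physical pages:  {share(DIGITISED_PAGES, PHYSICAL_PAGES):.1f}%")
    print(f"Interface vs physical pages:  {share(PAGES_IN_INTERFACE, PHYSICAL_PAGES):.2f}%")
```

On these figures the interface covers roughly 1.5% of what European libraries have digitised, which in turn is at most about 8.7% of their physical holdings; the interface therefore covers at most about 0.13% of the physical pages.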

Page 29: Europeana Newspapers -
Page 30: Europeana Newspapers -

The project's digital library holds only a fraction of the newspaper archive of the continent, and indeed of the world

Page 31: Europeana Newspapers -

As libraries, how should we represent that absence to users?

Page 32: Europeana Newspapers -

Should such absence be represented in the interface itself?

Page 33: Europeana Newspapers -

Vast white spaces in the list of results?

Page 34: Europeana Newspapers -

Provide standardised descriptions of digitised resources?

Standardised information for every digital resource, presenting its collections, content, licensing and re-use

Page 35: Europeana Newspapers -

Charts and graphs external to the interface?

Page 36: Europeana Newspapers -

There are other issues too:
• OCR quality varies
• Some pages (2m by 2015) have article segmentation
• Some library content has named entity extraction, affecting search results
• Different licensing statements from different countries
• Copyright date boundaries differ in each country

Page 37: Europeana Newspapers -

How should we give users better ways to understand the digital library?

Page 38: Europeana Newspapers -

What role can the API play in this?

Can opening up the data in the digital library, and allowing it to be explored in different ways, help?

Page 39: Europeana Newspapers -

Traditional model:
Interface (created by library)
Data (published by library)

With an API:
Interface (created by third party)
Data (published by library)

API – Application Programming Interface
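The "With an API" model can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the records, the `api_search` function and the JSON layout are hypothetical and do not describe The European Library's or Trove's real API. The point is only the separation of roles: the library publishes data in a machine-readable form, and a third party builds its own view using nothing but that response.

```python
import json
from collections import Counter

# Library side: hypothetical issue-level records held by the library.
_ISSUES = [
    {"headline": "Titanic sinks", "date": "1912-04-16"},
    {"headline": "Titanic inquiry opens", "date": "1912-05-03"},
    {"headline": "Titanic anniversary", "date": "1913-04-15"},
    {"headline": "Harvest report", "date": "1912-09-01"},
]


def api_search(query):
    """Library side: return matching records as a JSON string (hypothetical API)."""
    needle = query.lower()
    hits = [rec for rec in _ISSUES if needle in rec["headline"].lower()]
    return json.dumps({"query": query, "total": len(hits), "results": hits})


def issues_per_year(query):
    """Third-party side: count matches per year using only the JSON response."""
    response = json.loads(api_search(query))
    years = Counter(rec["date"][:4] for rec in response["results"])
    return dict(sorted(years.items()))


def text_chart(counts):
    """Third-party side: render the per-year counts as a simple text bar chart."""
    return "\n".join(f"{year} {'#' * n}" for year, n in counts.items())


if __name__ == "__main__":
    print(text_chart(issues_per_year("titanic")))
```

In practice `api_search` would be an HTTP request to the library's service; the third-party code would stay the same apart from how it fetches the JSON.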

Page 40: Europeana Newspapers -

Pioneering work of Trove API

Page 41: Europeana Newspapers -

Interface (created by library)
Data (published by library)

Trove Newspapers site, as published by the National Library of Australia and based on data provided by the Library

http://trove.nla.gov.au/newspaper

Page 42: Europeana Newspapers -

Trove Newspapers statistics, developed by a third party, based on data provided by the library

http://wraggelabs.com/shed/trove/graphs/

Interface (created by third party)
Data (published by library)

Page 43: Europeana Newspapers -

Headline Roulette, developed by a third party, based on data provided by the library

http://wraggelabs.com/shed/headline-roulette/

Interface (created by third party)
Data (published by library)

Page 44: Europeana Newspapers -

Word Count of Articles, developed by a third party, based on data provided by the library

http://dhistory.org/frontpages/53/words/

Interface (created by third party)
Data (published by library)

Page 45: Europeana Newspapers -

Sounds great! But …?

Page 46: Europeana Newspapers -

How many people in this audience would know how to build an interface on top of an API?

Page 47: Europeana Newspapers -

How many users do you know who could build on top of an API?

Page 48: Europeana Newspapers -

That is the problem I leave you to discuss

Thank you.

http://www.theeuropeanlibrary.org/tel4/newspapers