RJ Broker: Automating Delivery of Research Output to Repositories

RJ Broker: Automating Delivery of Research Output to Repositories, Muriel Mewissen (EDINA), RSP Webinar, 29 May 2013

Upload: edina-university-of-edinburgh

Post on 02-Jul-2015


DESCRIPTION

Presentation during Webinar given by Muriel Mewissen of EDINA, 29 May 2013.

TRANSCRIPT

Page 1: RJ Broker: Automating Delivery of Research Output to Repositories

RJ Broker: Automating Delivery of Research Output to Repositories

Muriel Mewissen - EDINA

RSP Webinar - 29 May 2013


Overview

• Need for a broker

• Development of the RJ Broker

• Publisher & Subject Repository Trials

• Future

• Conclusion


The need for a Broker


Focus

Support Open Access & Funder mandates:

• To increase the number of deposits to UK repositories

• To minimise effort by depositors and IR managers

[Diagram labels: Author/PI, Author 2 / Representative, Funders, Publishers, Publisher System, Subject repo, Institutional Repo, IR (three instances), Broker]

Development of the RJ Broker


Previous & Current Projects

• EDINA has led several projects in the repositories area since 2006:

– Prospero, The Depot, OpenDepot.org, EM-Loader, OA-RJ, UK RepositoryNet+ (RepNet), ORI and RJ Broker

– http://edina.ac.uk/projects/

• RJ Broker

– April 2012 to March 2013

– Extension to July 2013

– Component of the RepNet infrastructure


Abstract Model: A Delivery Service

• RJ Broker is a delivery service for research output

– Deposit: parcel or letter

– Metadata: address

– Notification: postcard

• Vision:

– http://oarepojunction.wordpress.com/2013/01/10/rj-broker-a-research-output-delivery-service/


RJ Broker

• RJ Broker is an independent middleware tool that:

– Accepts deposits of research articles
    NLM DTD, bespoke format, SWORD, Eprints

– Processes the deposits into a common format
    RJ Broker code

– Identifies target repositories from metadata
    Organisation and Repository Identification (ORI), http://ori.edina.ac.uk/

– Handles deposit into registered repositories
    SWORD, plugins (Eprints, DSpace, Fedora, …)

– Provides a tracking ID to the content supplier
    URIs
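
The steps on this slide can be read as a small pipeline. The following is a minimal sketch only, not the RJ Broker code: the record fields, the `ORG_TO_REPOS` lookup (standing in for an ORI query), the example URLs and the tracking-URI scheme are all hypothetical.

```python
from dataclasses import dataclass, field
from uuid import uuid4

# Hypothetical stand-in for an ORI lookup: organisation name -> repository endpoints.
ORG_TO_REPOS = {
    "University of Edinburgh": ["https://ir.example.ac.uk/sword"],
}

@dataclass
class Record:
    """Common internal format that every incoming deposit is normalised to."""
    title: str
    affiliations: list
    tracking_uri: str = ""
    targets: list = field(default_factory=list)

def accept(raw: dict) -> Record:
    """Accept a supplier deposit and convert it to the common format."""
    return Record(title=raw["title"].strip(),
                  affiliations=list(raw.get("affiliations", [])))

def identify_targets(record: Record) -> Record:
    """Map affiliations found in the metadata to known repositories."""
    for org in record.affiliations:
        for repo in ORG_TO_REPOS.get(org, []):
            if repo not in record.targets:
                record.targets.append(repo)
    return record

def broker(raw: dict) -> Record:
    """Run the pipeline; the actual deposit into each target is left out here."""
    record = identify_targets(accept(raw))
    # Hand a tracking identifier (a URI) back to the content supplier.
    record.tracking_uri = f"https://broker.example.org/track/{uuid4()}"
    return record

result = broker({"title": " A paper ",
                 "affiliations": ["University of Edinburgh", "Unknown Org"]})
print(result.targets, result.tracking_uri)
```

Records whose affiliations match no known organisation simply end up with an empty `targets` list, which corresponds to the "no identifiable organisation" case in the trial figures.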


RJ Broker

• The RJ Broker also:

– Allows browsing, search and download
    GUI & APIs (Eprints)

– Notifies other repositories of relevant content
    Monthly email
    “View”
    Useful for non-SWORD systems (CRIS) and individuals


Proposed RJ Broker Service


Publisher & Subject Repository Trials with the Pilot RJ Broker


Pilot RJ Broker

• Demonstrate the functionality

• Real data

• Test the scalability

• Publisher: Nature Publishing Group (NPG)

• Subject Repository: Europe PubMed Central


Publisher: NPG

• Record includes

– Metadata: rich, embargo, funder, multiple authors, ORCID in the future…

– Content: Multi-part publication (some content may be embargoed) i.e. full text

• Development work:

– Agree format for the record (NLM DTD based)

– EDINA developed an importer for the data

– Transfer using SWORD 1.3

– NPG added a new stream to its publication workflow that sends data to the RJ Broker
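
For readers unfamiliar with SWORD 1.3: a deposit is an HTTP POST of a package to a repository collection URI, accompanied by a few extra headers. The sketch below only assembles those headers and sends nothing. The file name, packaging identifier and `on_behalf_of` value are placeholders; the values actually used in the NPG trial are not given in the slides.

```python
import hashlib

def sword_deposit_headers(package: bytes, filename: str, packaging: str,
                          on_behalf_of: str = "") -> dict:
    """Build HTTP headers for a SWORD 1.3 style package deposit (POST)."""
    headers = {
        "Content-Type": "application/zip",
        "Content-Disposition": f"filename={filename}",
        # Checksum lets the receiving repository verify the package.
        "Content-MD5": hashlib.md5(package).hexdigest(),
        # Tells the repository which package format to expect.
        "X-Packaging": packaging,
    }
    if on_behalf_of:
        # Mediated deposit: the broker deposits for another user.
        headers["X-On-Behalf-Of"] = on_behalf_of
    return headers

example = sword_deposit_headers(
    b"abc", "article.zip",
    "http://example.org/packaging/placeholder",  # placeholder, not a real packaging URI
    on_behalf_of="supplier",                     # hypothetical account name
)
for name, value in example.items():
    print(f"{name}: {value}")
```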


NPG

• Legal agreements to respect embargo periods

– Between NPG & EDINA

– Between EDINA & IRs

• MIT signed the IR agreement

– Working on data importer for DSpace

• The IR agreement is worth considering in order to receive:

– Quality: Full text publication & rich metadata

– Timely: Straight from the publisher during the publication process even if embargoed

• Template agreement on request


NPG

• Set up took several months

– Time difference

– Relies on voluntary participation

– Requires a small amount of development work

– Legal framework

• Successful data transfer trial between NPG & RJ Broker in February 2013

• NPG ready to start continuous data feed

– A couple of journals at first, increasing with take-up

• Transfer to test IRs


Subject Repository: Europe PMC

• Use case supported by Jisc, RepNet & Wellcome Trust

– UK focus

– Supports funders' mandates

• Record includes:

– Metadata only: funders, grant numbers, first author only, DOI to full text…

– Fully Open Access

– New publication or Update to existing publication


Europe PMC

• Development work:

– Agree format for the record (bespoke)

– EDINA developed an importer for the data

– Transfer using SWORD 1.3

– MIMAS/EBI get a regular data feed from PMC

– Push data from their regular feed to the RJ Broker

• Set up took a few weeks

• Successful data transfer trial between Europe PMC & RJ Broker in February 2013

• Ready to start continuous data feed

– Average 160,000 records per month

• Transfer to test IRs


Europe PMC Trial in Numbers

~67,000 records in the trial dataset (~12 days, based on an average of 160,000 per month)

– 7,000 with no affiliation; ~60,000 sent to the RJ Broker

– 1,500 errors (bad format); ~58,500 successfully received by the RJ Broker

– 36,000 with no identifiable organisation; the RJ Broker identifies an organisation for ~22,500

– 8,000 with no repositories; ~14,500 have repositories

– 13,000 worldwide (not UK); 1,665 in the UK
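
The funnel is easier to judge as percentages. The short script below is only a worked calculation on the approximate counts from this slide; the stage labels are paraphrased, and the 30-day month used for the duration estimate is an assumption.

```python
# Approximate counts from the slide, in funnel order.
funnel = [
    ("records in trial dataset", 67_000),
    ("sent to RJ Broker", 60_000),
    ("successfully received", 58_500),
    ("organisation identified", 22_500),
    ("organisation has a repository", 14_500),
    ("repository is in the UK", 1_665),
]

total = funnel[0][1]
previous = total
for stage, count in funnel:
    share_of_total = 100 * count / total
    kept_from_previous = 100 * count / previous
    print(f"{stage:32s} {count:>7,d}  "
          f"{share_of_total:5.1f}% of total  "
          f"{kept_from_previous:5.1f}% of previous stage")
    previous = count

# Period covered by the trial dataset at the quoted average monthly volume,
# assuming a 30-day month.
days = 67_000 / 160_000 * 30
print(f"trial dataset covers about {days:.1f} days")
```

On these figures about 2.5% of the trial records end up with a UK repository as a target, and the largest single drop (36,000 records) occurs at organisation identification.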


Europe PMC Trial in Numbers


Europe PMC Trial in Numbers

[Chart: number of associated repositories for records with one organisation identified]

Europe PMC Trial in Numbers

Country code  Country         Number of records
us            USA             5934
gb            United Kingdom  1665
ca            Canada          1099
jp            Japan           722
au            Australia       655
se            Sweden          313
es            Spain           304
nl            Netherlands     299
de            Germany         239
tw            Taiwan          181
fr            France          180
br            Brazil          179
it            Italy           176
be            Belgium         174
th            Thailand        168
za            South Africa    160
sd            Sudan           155

55 other countries with less than 1% of records each: 1836

Europe PMC Trial in Numbers

Top UK Institutions

Destination                                                  Number of records
University of Oxford                                         170
University of Cambridge                                      139
University College London                                    119
Imperial College                                             103
University of Edinburgh                                      88
University of Manchester                                     63
University of Bristol                                        61
University of Nottingham, University of Newcastle Upon Tyne  56
Liverpool                                                    55
University of Glasgow                                        52

78 UK institutions in total

Europe PMC Trial in Numbers


RJ Broker Trial Installation

• GUI preview access

• OA records from NPG & Europe PMC are available for browsing & downloading

– Check what we have for your institution!

– http://devel.edina.ac.uk:1203/

– !!! It is only a trial & development installation

– !!! Not a service yet


RJ Broker Trial

Demonstrate features:

• Importing records from different suppliers

• Storing & Processing (~2s per record)

• Repository Identification

• Delivery

• Browsing & Download

More end-to-end use cases

• External IRs

• Different IR platforms (Eprints, DSpace, Fedora…)
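
As a rough capacity check, the ~2 s per record figure can be combined with the Europe PMC average of 160,000 records per month quoted earlier in the talk. The calculation assumes strictly sequential processing at a flat 2 s per record, which the slides do not state.

```python
SECONDS_PER_RECORD = 2        # "~2s per record" from the trial
RECORDS_PER_MONTH = 160_000   # Europe PMC average quoted in the talk

total_seconds = SECONDS_PER_RECORD * RECORDS_PER_MONTH
total_hours = total_seconds / 3600
total_days = total_hours / 24
print(f"{total_seconds:,d} s = {total_hours:.1f} h = "
      f"{total_days:.1f} days of processing per month")
```

Under that assumption a month of Europe PMC records needs under four days of processing time.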


The Future


Immediate Future

• Project extension (31 July 2013)

• Prepare transition to service

    – Service installation

    – Add functionality

        • Email notification to all (non-registered) IRs

        • Improve support for different repository platforms

        • Bulk transfer of data backlog

        • Support RIOXX metadata export

    – Early adopters

        • IRs

        • Data suppliers to establish data feeds

    – Start building data store

        • Content kept for 1 year to start with


Future (after July 2013)

• Transition to Service

– Within RepNet Infrastructure

– SLD

– Roadmap for adding further functionality

• Open for recruitment


Proposed Service to IRs

• All IRs:

– Browse and download OA content through public APIs and GUI to the RJ Broker

– Ready to accept registration of new IRs

– Info 'pack' on how to register with the RJ Broker delivery service

– Monthly email notification to IRs, containing:

    • a list of citations for the OA publications supplied to the RJ Broker in the last month that are relevant to the institution

    • instructions on how to access the APIs and GUI for browsing and downloading the content

    • an invitation to register with the RJ Broker delivery service


Proposed Service to IRs

• Registration process:

– IR provides SWORD endpoint credentials to RJ Broker

– IR is configured to accept RJ Broker data

– Opting in to receive embargoed content requires signing a legal agreement

• Registered IRs:

– Direct delivery of new OA content from all suppliers to the IR's SWORD endpoint

– Direct delivery of embargoed content from all suppliers to the IR's SWORD endpoint, subject to the legal agreement
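
The proposed delivery rules reduce to a small decision. The sketch below is illustrative only: the `Repository` fields and the `delivery_mode` function are hypothetical names, not part of the RJ Broker, and withholding embargoed content from non-registered IRs is an assumption drawn from the slides mentioning only OA content for them.

```python
from dataclasses import dataclass

@dataclass
class Repository:
    name: str
    registered: bool = False         # SWORD endpoint credentials supplied to the broker
    embargo_agreement: bool = False  # legal agreement on embargoed content signed

def delivery_mode(repo: Repository, embargoed: bool) -> str:
    """Decide how a relevant record reaches an IR: 'deliver', 'notify' or 'withhold'."""
    if embargoed:
        # Embargoed content goes only to registered IRs that signed the agreement.
        if repo.registered and repo.embargo_agreement:
            return "deliver"
        return "withhold"
    # Open Access content: direct SWORD delivery if registered,
    # otherwise the monthly email plus browse/download access.
    return "deliver" if repo.registered else "notify"

for repo in (Repository("unregistered IR"),
             Repository("registered IR", registered=True),
             Repository("registered IR with agreement", registered=True,
                        embargo_agreement=True)):
    print(repo.name, delivery_mode(repo, embargoed=False),
          delivery_mode(repo, embargoed=True))
```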


Proposed Service to Data Suppliers

• Ready to accept new data suppliers

• Info 'pack' on how to become an RJ Broker data supplier

• Enable regular data feed into RJ Broker

• Forward supplied content to registered IRs (when identification of target IRs is possible)

• Email notification of supplied content to relevant IRs

• Expose supplied OA content through APIs and GUI for browsing and downloading

• Allow tracking and full access to own content


Conclusion


Conclusion

• Effective solution to content dissemination

• Benefits all

– Increases OA & transparency

– Helps with reporting

– Supports promotion

– Saves time & effort (money)

• Appeal of service will grow

• Small amount of development work needed locally but it is worth it!


Thanks

• You

• RSP

• EDINA Team

– Ian Stuart, Muriel Mewissen, Peter Burnhill

– Cesare Bellini, Christine Rees, Theo Andrews

• NPG, MIT, Europe PMC, MIMAS, EBI, Wellcome Trust

• UK RepositoryNet+

• Jisc
