RJ Broker: Automating Delivery of Research Output to Repositories
DESCRIPTION
Presentation given during a webinar by Muriel Mewissen of EDINA, 29 May 2013.
TRANSCRIPT
RJ Broker: Automating Delivery of Research Output to Repositories
Muriel Mewissen - EDINA
RSP Webinar - 29 May 2013
Overview
• Need for a broker
• Development of the RJ Broker
• Publisher & Subject Repository Trials
• Future
• Conclusion
The need for a Broker
Focus
Support Open Access & Funder mandates:
• To increase the number of deposits to UK repositories
• To minimise effort by depositors and IR managers
[Diagram labels: Author/PI; Author 2 / Representative; Funders; Publishers; Publisher System; Subject repo; Institutional Repo (IR, several); Broker]
Development of the RJ Broker
Previous & Current Projects
• EDINA has led several projects in the repositories area since 2006:
– Prospero, The Depot, OpenDepot.org, EM-Loader, OA-RJ, UK RepositoryNet+ (RepNet), ORI and RJ Broker
– http://edina.ac.uk/projects/
• RJ Broker
– April 2012 to March 2013
– Extension to July 2013
– Component of the RepNet infrastructure
Abstract Model: A Delivery Service
• RJ Broker is a delivery service for research output
– Deposit: parcel or letter
– Metadata: address
– Notification: postcard
• Vision:
– http://oarepojunction.wordpress.com/2013/01/10/rj-broker-a-research-output-delivery-service/
RJ Broker
• RJ Broker is an independent middleware tool that can:
– Accept deposits of research articles: NLM DTD, bespoke format, SWORD, Eprints
– Process the deposits into a common format: RJ Broker code
– Identify target repositories from metadata: Organisation and Repository Identification (ORI), http://ori.edina.ac.uk/
– Handle deposit to registered repositories: SWORD, plugins (Eprints, DSpace, Fedora, …)
– Provide a tracking ID to the content supplier: URIs
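The steps listed above can be sketched as a small pipeline. This is only an illustration, not the RJ Broker code: the `Record` fields, the in-memory `ORG_TO_REPOS` table (a stand-in for an ORI lookup), the example URLs and the tracking URI pattern are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List
import uuid

# Hypothetical stand-in for an ORI lookup: organisation name -> repository endpoints.
ORG_TO_REPOS: Dict[str, List[str]] = {
    "University of Example": ["https://repo.example.ac.uk/sword"],
}


@dataclass
class Record:
    """Common internal format that every supplier deposit is normalised into."""
    title: str
    affiliations: List[str]
    targets: List[str] = field(default_factory=list)
    tracking_uri: str = ""


def normalise(raw: dict) -> Record:
    """Turn a supplier-specific deposit (here a plain dict) into the common format."""
    return Record(
        title=raw["title"].strip(),
        affiliations=[a.strip() for a in raw.get("affiliations", [])],
    )


def identify_targets(record: Record) -> List[str]:
    """Map each affiliation to its known repositories, without duplicates."""
    targets: List[str] = []
    for org in record.affiliations:
        for repo in ORG_TO_REPOS.get(org, []):
            if repo not in targets:
                targets.append(repo)
    return targets


def broker(raw: dict) -> Record:
    """Accept a deposit, normalise it, find target repositories, assign a tracking URI."""
    record = normalise(raw)
    record.targets = identify_targets(record)
    # Hypothetical URI pattern; the slides only say that tracking IDs are URIs.
    record.tracking_uri = f"https://broker.example.org/records/{uuid.uuid4()}"
    return record


if __name__ == "__main__":
    result = broker({"title": " A paper ", "affiliations": ["University of Example"]})
    print(result.tracking_uri, result.targets)
```

Actual delivery to each target is left out here; records whose affiliations match no known organisation simply end up with an empty `targets` list.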
RJ Broker
• RJ Broker can also:
– Allow browsing, search and download: GUI & APIs (Eprints)
– Notify other repositories with relevant content: monthly email, “View”; useful for non-SWORD systems (CRIS) and individuals
Proposed RJ Broker Service
Publisher & Subject Repository Trials with the Pilot RJ Broker
Pilot RJ Broker
• Demonstrate the functionality
• Real data
• Test the scalability
• Publisher: Nature Publishing Group (NPG)
• Subject Repository: Europe PubMed Central
Publisher: NPG
• Record includes
– Metadata: rich, including embargo, funder, multiple authors, and ORCID in the future…
– Content: full text as a multi-part publication (some content may be embargoed)
• Development work:
– Agree format for the record (NLM DTD based)
– EDINA developed an importer for the data
– Transfer using SWORD 1.3
– NPG added a new stream to their publication workflow that sends data to the RJ Broker
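A SWORD 1.3 transfer such as the one mentioned above is an HTTP POST of a package to a repository collection URI. The sketch below only builds the request headers and sends nothing. The file name, credentials and packaging identifier are placeholders, and the package format actually agreed between NPG and EDINA is not shown in these slides.

```python
import base64
from typing import Dict


def sword13_headers(filename: str, packaging: str, username: str,
                    password: str, dry_run: bool = False) -> Dict[str, str]:
    """Build headers for a SWORD 1.3 style deposit (illustrative subset only)."""
    # HTTP Basic authentication: base64 of "username:password".
    credentials = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return {
        "Content-Type": "application/zip",             # assumes a zipped package
        "Content-Disposition": f"filename={filename}",
        "X-Packaging": packaging,                      # identifies the package format
        "X-No-Op": "true" if dry_run else "false",     # "true" asks for a test run only
        "Authorization": f"Basic {credentials}",
    }


if __name__ == "__main__":
    headers = sword13_headers(
        filename="record.zip",
        packaging="http://purl.org/net/sword-types/METSDSpaceSIP",  # placeholder choice
        username="broker",
        password="secret",
        dry_run=True,
    )
    for name, value in headers.items():
        print(f"{name}: {value}")
```

To send a deposit, these headers could be passed to `urllib.request.Request(url, data=package_bytes, headers=headers, method="POST")`, where `url` is the collection URI supplied by the receiving repository.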
NPG
• Legal agreements to respect embargo periods
– Between NPG & EDINA
– Between EDINA & IRs
• MIT signed the IR agreement
– Working on data importer for DSpace
• Worth considering in order to receive:
– Quality: Full text publication & rich metadata
– Timely: Straight from the publisher during the publication process even if embargoed
• Template agreement on request
NPG
• Set up took several months
– Time difference
– Relies on voluntary participation
– Requires small amount of development work
– Legal framework
• Successful data transfer trial between NPG & RJ Broker in February 2013
• NPG ready to start continuous data feed
– A couple of journals first, increasing with take-up
• Transfer to test IRs
Subject Repository: Europe PMC
• Use case supported by Jisc, RepNet & Wellcome Trust
– UK focus
– Supports funder mandates
• Record includes:
– Metadata only: funders, grant numbers, first author only, DOI to full text…
– Fully Open Access
– New publication or Update to existing publication
Europe PMC
• Development work:
– Agree format for the record (bespoke)
– EDINA developed an importer for the data
– Transfer using SWORD 1.3
– MIMAS/EBI get regular data feed from PMC
– Push data from their regular feed to the RJ Broker
• Set up took a few weeks
• Successful data transfer trial between Europe PMC & RJ Broker in February 2013
• Ready to start continuous data feed
– Average 160,000 records per month
• Transfer to test IRs
Europe PMC Trial in Numbers
[Funnel diagram: ~67,000 → ~60,000 → ~58,500 → ~22,500 → ~14,500 → 1,665 records]
Europe PMC Trial in Numbers
• 67,000 records in the trial dataset (~12 days based on an average of 160,000 per month)
• 60,000 sent to RJ Broker (7,000 had no affiliation)
• 58,500 successfully received by RJ Broker (1,500 errors: bad format)
• RJ Broker identifies an organisation for 22,500 (36,000 with no identifiable organisation)
• 14,500 have repositories (8,000 have no repositories)
• 1,665 in the UK (13,000 worldwide, not UK)
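The counts on this slide form a funnel, and the arithmetic can be checked with a short script. The stage labels are paraphrased from the slide, and the 30-day month used for the "~12 days" estimate is an assumption.

```python
from typing import List, Tuple

# Stage counts as quoted on the slide (rounded figures).
STAGES: List[Tuple[str, int]] = [
    ("records in the trial dataset", 67_000),
    ("sent to RJ Broker", 60_000),
    ("successfully received by RJ Broker", 58_500),
    ("organisation identified", 22_500),
    ("have repositories", 14_500),
    ("in the UK", 1_665),
]


def funnel(stages: List[Tuple[str, int]]) -> List[Tuple[str, int, int, float]]:
    """Return (label, count, dropped since previous stage, % of first stage)."""
    total = stages[0][1]
    previous = total
    rows = []
    for label, count in stages:
        rows.append((label, count, previous - count, round(100 * count / total, 1)))
        previous = count
    return rows


def days_covered(records: int, records_per_month: int = 160_000,
                 days_per_month: int = 30) -> float:
    """Estimate how many days of feed a batch represents (30-day month assumed)."""
    return records / records_per_month * days_per_month


if __name__ == "__main__":
    for label, count, dropped, share in funnel(STAGES):
        print(f"{label:38s} {count:>7,d}  dropped {dropped:>6,d}  {share:5.1f}%")
    print(f"trial dataset covers about {days_covered(67_000):.1f} days of feed")
```

The last drop works out at 12,835 records, which the slide rounds to 13,000.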
Europe PMC Trial in Numbers
[Chart: number of associated repositories for records with one organisation identified]
Europe PMC Trial in Numbers
Country code   Country          Number of records
us             USA              5934
gb             United Kingdom   1665
ca             Canada           1099
jp             Japan            722
au             Australia        655
se             Sweden           313
es             Spain            304
nl             Netherlands      299
de             Germany          239
tw             Taiwan           181
fr             France           180
br             Brazil           179
it             Italy            176
be             Belgium          174
th             Thailand         168
za             South Africa     160
sd             Sudan            155
55 other countries with less than 1% of records each: 1836
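As a cross-check, the per-country counts can be summed and compared with the roughly 14,500 records that have repositories. That this table covers exactly that set of records is an assumption; the slides do not say so explicitly.

```python
# Per-country record counts copied from the table above.
RECORDS_BY_COUNTRY = {
    "us": 5934, "gb": 1665, "ca": 1099, "jp": 722, "au": 655, "se": 313,
    "es": 304, "nl": 299, "de": 239, "tw": 181, "fr": 180, "br": 179,
    "it": 176, "be": 174, "th": 168, "za": 160, "sd": 155,
}
OTHER_COUNTRIES = 1836  # 55 other countries combined

total = sum(RECORDS_BY_COUNTRY.values()) + OTHER_COUNTRIES
uk_share = 100 * RECORDS_BY_COUNTRY["gb"] / total
one_percent = total / 100

# Every country listed individually should hold at least 1% of the records.
all_listed_above_threshold = all(n >= one_percent for n in RECORDS_BY_COUNTRY.values())

if __name__ == "__main__":
    print(f"total records: {total}")
    print(f"UK share: {uk_share:.1f}%")
    print(f"all listed countries at or above 1%: {all_listed_above_threshold}")
```

The total comes to 14,439, in line with the rounded 14,500 figure, and the UK accounts for about 11.5% of it.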
Top UK Institutions
Destination                                                   Number of records
University of Oxford                                          170
University of Cambridge                                       139
University College London                                     119
Imperial College                                              103
University of Edinburgh                                       88
University of Manchester                                      63
University of Bristol                                         61
University of Nottingham, University of Newcastle Upon Tyne   56
Liverpool                                                     55
University of Glasgow                                         52
Europe PMC Trial in Numbers
78 UK Institutions in total
RJ Broker Trial Installation
• GUI preview access
• OA records from NPG & Europe PMC are available for browsing & downloading
– Check what we have for your institution!
– http://devel.edina.ac.uk:1203/
– Note: this is only a trial & development installation
– Note: it is not a service yet
RJ Broker Trial
Demonstrate features:
• Importing records from different suppliers
• Storing & Processing (~2s per record)
• Repository Identification
• Delivery
• Browsing & Download
More end-to-end use cases
• External IRs
• Different IR platforms (Eprints, DSpace, Fedora…)
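The processing time of ~2s per record quoted above gives a rough feel for scalability. The estimate below assumes records are processed one after another at a constant rate; the `workers` parameter is a hypothetical knob for parallel processing, which the slides do not mention.

```python
SECONDS_PER_DAY = 86_400


def processing_days(records: int, seconds_per_record: float = 2.0,
                    workers: int = 1) -> float:
    """Days needed to process `records` at a constant per-record cost."""
    return records * seconds_per_record / workers / SECONDS_PER_DAY


if __name__ == "__main__":
    # Europe PMC average monthly feed quoted in the slides.
    print(f"160,000 records: {processing_days(160_000):.1f} days")
    # Records successfully received during the trial.
    print(f"58,500 records: {processing_days(58_500) * 24:.1f} hours")
```

On these assumptions a month of Europe PMC records takes under four days of processing, so a single sequential worker would keep up with the feed.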
The Future
Immediate Future
• Project extension (31 July 2013)
• Prepare transition to service
– Service installation
– Add functionality
• Email notification to all (non-registered) IRs
• Improve support for different repository platforms
• Bulk transfer of data backlog
• Support RIOXX metadata export
– Early adopters
• IRs
• Data suppliers to establish data feeds
– Start building data store
• Content kept for 1 year to start with
Future (after July 2013)
• Transition to Service
– Within RepNet Infrastructure
– SLD
– Roadmap for adding further functionality
• Open for recruitment
Proposed Service to IRs
• All IRs:
– Browse and download OA content through public APIs and GUI to the RJ Broker
– Ready to accept registration of new IRs
– Info ‘pack’ on how to register for the RJ Broker delivery service
– Monthly email notification to IRs listing citations for the OA publications that were supplied to the RJ Broker in the last month and are relevant to the institution. It includes instructions on how to access the APIs and GUI for browsing and downloading the content, and an invitation to register for the RJ Broker delivery service.
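The monthly notification described above amounts to grouping last month's records by institution and listing their citations. A minimal sketch follows, assuming each record is already reduced to an (institution, citation) pair; the wording of the message is invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_digests(records: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Group (institution, citation) pairs into one plain-text digest per institution."""
    by_institution: Dict[str, List[str]] = defaultdict(list)
    for institution, citation in records:
        by_institution[institution].append(citation)

    digests: Dict[str, str] = {}
    for institution, citations in by_institution.items():
        lines = [f"New OA publications relevant to {institution}: {len(citations)}"]
        lines += [f"{i}. {citation}" for i, citation in enumerate(citations, start=1)]
        digests[institution] = "\n".join(lines)
    return digests


if __name__ == "__main__":
    sample = [
        ("University of Example", "Author A (2013). First paper."),
        ("Example College", "Author B (2013). Second paper."),
        ("University of Example", "Author C (2013). Third paper."),
    ]
    for text in build_digests(sample).values():
        print(text)
        print()
```

Access instructions and the registration invitation would be appended to each digest before sending.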
Proposed Service to IRs
• Registration process:
– IR provides SWORD endpoint credentials to RJ Broker
– IR is configured to accept RJ Broker data
– Opting in to receive embargoed content requires signing a legal agreement
• Registered IRs:
– Direct delivery of new OA content from all suppliers to the IR SWORD endpoint
– Direct delivery of embargoed content from all suppliers to the IR SWORD endpoint, subject to legal agreement
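The delivery rules above reduce to a simple check: registered IRs receive OA content directly, and embargoed content only if they have signed the legal agreement. The sketch below assumes that an embargo is represented by an end date; the class and field names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Repository:
    name: str
    registered: bool                        # has provided SWORD endpoint credentials
    embargo_agreement_signed: bool = False  # has opted in to embargoed content


def can_deliver(repo: Repository, embargo_end: Optional[date], today: date) -> bool:
    """Decide whether a record may be pushed directly to this repository."""
    if not repo.registered:
        return False  # non-registered IRs only get notifications and browse access
    under_embargo = embargo_end is not None and today < embargo_end
    return (not under_embargo) or repo.embargo_agreement_signed


if __name__ == "__main__":
    today = date(2013, 5, 29)
    signed = Repository("IR A", registered=True, embargo_agreement_signed=True)
    unsigned = Repository("IR B", registered=True)
    print(can_deliver(signed, date(2013, 12, 1), today))
    print(can_deliver(unsigned, date(2013, 12, 1), today))
```

A record with no embargo date, or one whose embargo has already ended, is treated as open access here.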
Proposed Service to Data Suppliers
• Ready to accept new data suppliers
• Info ‘pack’ on how to become an RJ Broker data supplier
• Enable regular data feed into RJ Broker
• Forward supplied content to registered IRs (when identification of target IRs is possible)
• Email notification of supplied content to relevant IRs
• Expose supplied OA content through APIs and GUI for browsing and downloading
• Allow tracking and full access to own content
Conclusion
• Effective solution to content dissemination
• Benefits all
– Increase OA & transparency
– Help with reporting
– Support promotion
– Saves time & effort (money)
• Appeal of service will grow
• Small amount of development work needed locally but it is worth it!
Thanks
• You
• RSP
• EDINA Team
– Ian Stuart, Muriel Mewissen, Peter Burnhill
– Cesare Bellini, Christine Rees, Theo Andrews
• NPG, MIT, Europe PMC, MIMAS, EBI, Wellcome Trust
• UK RepositoryNet+
• Jisc