rdm liasa webinar

Digital curation: why managing and sharing data matters to universities. Sarah Jones and Joy Davidson, Digital Curation Centre. [email protected] [email protected] A LIASA HELIG webinar, 30th April 2013, www.liasa.org.za/node/977

Upload: sarah-jones

Post on 11-May-2015


DESCRIPTION

Presentation given by Sarah Jones and Joy Davidson to a group of South African librarians at a webinar organised by LIASA HELIG. http://www.liasa.org.za/node/977

TRANSCRIPT

Page 1: RDM LIASA webinar

Digital curation: why managing and sharing data matters to universities

Sarah Jones and Joy Davidson

Digital Curation Centre

[email protected]@glasgow.ac.uk

a LIASA HELIG webinar, 30th April 2013, www.liasa.org.za/node/977

Page 2: RDM LIASA webinar

Digital Curation Centre

Jisc-funded consortium comprising units from the Universities of:

– Bath (UKOLN)

– Edinburgh (DCC Centre)

– Glasgow (HATII)

Launched 1st March 2004 as a national centre for solving challenges in digital curation that could not be tackled by any single institution or discipline

Page 3: RDM LIASA webinar

Overview of session: four brief modules

1. Introduction to digital curation – how does research data management fit into the curation lifecycle?

2. Benefits and drivers for research data management

3. Review of current research data management activity in UK Universities

4. What role does the library have to play in research data management?

Page 4: RDM LIASA webinar

Please feel free to ask questions at any time!

• During the session you can ask questions. Simply type these into the chat box.

• Questions will be gathered and speakers will respond to selected questions at the end of each module.

• There will be a chance for additional questions at the end of the session.

Page 5: RDM LIASA webinar

DIGITAL CURATION, PRESERVATION AND RESEARCH DATA MANAGEMENT – AN INTRODUCTION

Page 6: RDM LIASA webinar

An introduction to digital curation

• What is digital curation?

• What is the difference between curation, preservation and data management?

• What sort of activities are involved in digital curation?

• Who should be involved in digital curation?


Page 7: RDM LIASA webinar

“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”

Data have importance as the evidential base of scholarly conclusions

Curation is part of good research practice

What is data curation?

Page 8: RDM LIASA webinar

Are data curation, preservation and management different?

• Lots of different terms are being used - are they the same or different?

• Essentially, they are all part of the curation lifecycle

Page 9: RDM LIASA webinar

Curation Lifecycle Model

Page 10: RDM LIASA webinar

Key questions to consider:

• what data will be created?

• how much storage is needed?

• where will data be stored in the short and longer term?

• are there ethical issues that require consent?

Many funders expect data management & sharing plans at the grant application stage!

Data Management Planning

Page 11: RDM LIASA webinar

Key questions to consider:

What information do users need to understand the data?

- descriptions of all variables / fields and their values

- code labels, classification schema, abbreviations list

- information about the project and data creators

- tips on usage e.g. exceptions, quirks, questionable results

How will this be captured, and who will capture/record it?

Are there standards that need to be followed?

Metadata & documentation
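As a rough illustration of the documentation questions on this slide, the sketch below records variable descriptions, code labels, abbreviations, project information and usage tips in one machine-readable data dictionary. The dataset, field names and the helper `undocumented_columns` are all hypothetical and invented for illustration; they do not come from the presentation or from any particular metadata standard.

```python
import json

# Hypothetical data dictionary for an illustrative survey dataset.
# Every name and value here is an assumption made for the example.
data_dictionary = {
    "project": {
        "title": "Example survey (hypothetical)",
        "creators": ["A. Researcher"],
    },
    "variables": {
        "age_group": {
            "description": "Age band of the respondent",
            "codes": {"1": "18-29", "2": "30-49", "3": "50+"},
        },
        "income": {
            "description": "Annual household income",
            "unit": "GBP",
            "missing_value": -9,
        },
    },
    "abbreviations": {"GBP": "Pounds sterling"},
    "usage_notes": ["Income values of -9 mean 'not answered'."],
}


def undocumented_columns(columns, dictionary):
    """Return the dataset columns that have no entry in the data dictionary."""
    return [c for c in columns if c not in dictionary["variables"]]


if __name__ == "__main__":
    # Save the dictionary alongside the data so later users can interpret it.
    print(json.dumps(data_dictionary, indent=2))
    print(undocumented_columns(["age_group", "income", "region"], data_dictionary))
```

A check like `undocumented_columns` is one possible way to spot fields that still need describing before the data are deposited or shared.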

Page 12: RDM LIASA webinar

Key questions to consider:

• What data must be kept? (for validation, etc)

• What must not be kept? (e.g. personal data)

• Is it worth keeping the data? – cost/benefits

• Where will the data be kept?

Selecting what to keep

Page 13: RDM LIASA webinar

Storing data

Key questions to consider:

What amount of storage is available for the active phase?

What facilities are needed in the active phase?

- remote access to work from home

- file sharing with others

- high levels of security for sensitive data

How will the data be backed up?

Where will data be stored for the longer-term?

Page 14: RDM LIASA webinar

Institutional data repositories

Not intended to replace national, subject or other established data collections

Acknowledge hybrid environment

http://datashare.is.ed.ac.uk

www.dspace.cam.ac.uk/

https://databank.ora.ox.ac.uk

Essex-RDR and DataPool at Southampton

Page 15: RDM LIASA webinar

External data centres

Research funders’ data centres…

List of data centres: http://databib.org

Structured databases

Disciplinary & community initiatives

Page 16: RDM LIASA webinar

Finding and reusing data

Key questions to consider:

How can researchers make their data visible and citable?

Page 17: RDM LIASA webinar

Data catalogues

Develop a research data extension to the CERIF standard

JISC & DCC planning National coordination

http://cerif4datasets.wordpress.com

Page 18: RDM LIASA webinar

Who should be involved in curation?

Research Organisations

Funders

Data centres

Advisory bodies

Support services

Researchers

Publishers

Page 19: RDM LIASA webinar

BENEFITS AND DRIVERS – THE UK POLICY LANDSCAPE

Page 20: RDM LIASA webinar

“Data sets are becoming the new instruments of science”

Dan Atkins, University of Michigan

Page 21: RDM LIASA webinar

Digital data as the new special collections?

Sayeed Choudhury, Johns Hopkins

Page 22: RDM LIASA webinar

Research data: institutional crown jewels?

http://www.flickr.com/photos/lifes__too_short__to__drink__cheap__wine/4754234186/

Page 23: RDM LIASA webinar

Expectations of public access

“Publicly funded research data are a public good, produced in the public interest, which should be made openly available with as few restrictions as possible in a timely and responsible manner that does not harm intellectual property.”

RCUK Common Principles on Data Policy

http://www.rcuk.ac.uk/research/Pages/DataPolicy.aspx

Page 24: RDM LIASA webinar

http://www.bis.gov.uk/innovatingforgrowth

…open data

Page 25: RDM LIASA webinar

...personal data

Page 26: RDM LIASA webinar

Benefits of data sharing (1)

www.nytimes.com/2010/08/13/health/research/13alzheimer.html?pagewanted=all&_r=0

“It was unbelievable. It's not science the way most of us have practiced in our careers. But we all realised that we would never get biomarkers unless all of us parked our egos and intellectual property noses outside the door and agreed that all of our data would be public immediately.”

Dr John Trojanowski, University of Pennsylvania

... scientific breakthroughs

Page 27: RDM LIASA webinar

Benefits of data sharing (2)

www.guardian.co.uk/politics/2013/apr/18/uncovered-error-george-osborne-austerity

... validation of results

“It was a mistake in a spreadsheet that could have been easily overlooked: a few rows left out of an equation to average the values in a column.

The spreadsheet was used to draw the conclusion of an influential 2010 economics paper: that public debt of more than 90% of GDP slows down growth. This conclusion was later cited by the International Monetary Fund and the UK Treasury to justify programmes of austerity that have arguably led to riots, poverty and lost jobs.”

Page 28: RDM LIASA webinar

Benefits of data sharing (3)

“There is evidence that studies that make their data available do indeed receive more citations than similar studies that do not.”

Piwowar, H. and Vision, T.J. (2013) "Data reuse and the open data citation advantage" https://peerj.com/preprints/1.pdf

9% - 30% increase

... more citations

Page 30: RDM LIASA webinar

“Research organisations will ensure that effective data curation is provided throughout the full data lifecycle, with ‘data curation’ and ‘data lifecycle’ being as defined by the Digital Curation Centre. The full range of responsibilities associated with data curation over the data lifecycle will be clearly allocated...”

www.epsrc.ac.uk/about/standards/researchdata/Pages/expectations.aspx

...institutional responsibility

Page 31: RDM LIASA webinar

Research funder data policies

www.dcc.ac.uk/resources/policy-and-legal/overview-funders-data-policies

Page 32: RDM LIASA webinar

Ultimately funders expect:

• timely release of data

- once patents are filed or on (acceptance for) publication

• open data sharing

- minimal or no restrictions if possible

• preservation of data

- typically 5-10+ years if of long-term value

See the RCUK Common Principles on Data Policy: www.rcuk.ac.uk/research/Pages/DataPolicy.aspx

Page 33: RDM LIASA webinar

Jisc MRD programmes

Managing Research Data programmes funded by the Jisc:

• MRD 01: October 2009 – July 2011

– £4.3 million investment

– www.jisc.ac.uk/whatwedo/programmes/mrd.aspx

• MRD 02: October 2011 – July 2013

– £4.6 million investment

– www.jisc.ac.uk/whatwedo/programmes/di_researchmanagement/managingresearchdata.aspx

Programme Manager: Simon Hodson [email protected]

Twitter: #jiscmrd

Page 34: RDM LIASA webinar

The DCC Mission

“Helping to build capacity, capability and skills in data management and curation across the UK’s higher education research community”

Phase 3 Business Plan

www.dcc.ac.uk

Page 35: RDM LIASA webinar

DCC Institutional Engagements

With funding from HEFCE we’re:

• Working intensively with 21 HEIs to increase RDM capability

– 60 days of effort per HEI drawn from a mix of DCC staff

– Deploy DCC & external tools, new approaches & best practice

• Support varies based on what each institution wants/needs

• Lessons & examples will be shared with the community

www.dcc.ac.uk/community/institutional-engagements

Page 36: RDM LIASA webinar

Some unis we are working with

Page 37: RDM LIASA webinar

Common DCC IE activities

• Establishing steering groups

• Making the case for RDM

• Assessing needs

• Developing policy and strategy

• Piloting tools

• Offering DMP consultations

• Delivering training

• Setting up guidance websites

• ...

Page 38: RDM LIASA webinar

CURRENT RDM INITIATIVES IN UK UNIVERSITIES

Page 39: RDM LIASA webinar

How to develop RDM services

Guide and case studies: www.dcc.ac.uk/resources/developing-rdm-services

Page 40: RDM LIASA webinar

Components of a research data service

Page 41: RDM LIASA webinar

Institutional RDM policies

www.dcc.ac.uk/resources/policy-and-legal/institutional-data-policies

Page 42: RDM LIASA webinar

Early research data policies

“Statement of commitment”

Infrastructure policy

“10 commandments”

mutual promises

aspirational

Baseline of RCUK Code

+ procedures & support

legal tone / language

a section in uni DM policy

useful guide as appendix

Based on Edin. with a few additions

Page 43: RDM LIASA webinar

RDM strategies and roadmaps

A series of blog posts

www.dcc.ac.uk/news

Links to example roadmaps

http://tiny.cc/EPSRCroadmaps

Page 44: RDM LIASA webinar

University of Bath RDM roadmap

• Based on Monash University RDM strategy

• Identifies the current position and proposes activity

• Defines roles and responsibilities and timeframes

http://www.bath.ac.uk/rdso/University-of-Bath-Roadmap-for-EPSRC.pdf

Page 45: RDM LIASA webinar

Guidance webpages

www.gla.ac.uk/datamanagement

www.bath.ac.uk/research/data

Page 47: RDM LIASA webinar

Online training for PhD students

http://datalib.edina.ac.uk/mantra

Page 48: RDM LIASA webinar

Data Management Planning support

• Guidelines / templates on what to include in plans

• Example answers, guidance and links to local support

• A library of successful DMPs to reuse

• Tailored consultancy services

• Online tools (e.g. customised DMPonline)

• Links / flags embedded in grant systems

• ...

Page 49: RDM LIASA webinar

Research data storage

Blue Peta at Bristol

1st 5TB free per Data Steward, then £400 per TB p.a. for disk storage; tape backup £40 per TB

http://data.bris.ac.uk

• £2m funding to date

• Petascale facility – expandable

• 3 machine rooms – resilience (tape archive 2012)

• Available to all researchers for research data
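The disk pricing quoted on this slide (first 5TB free per Data Steward, then £400 per TB per year) can be worked through with a small calculation. The sketch below is only an illustration: the function name `annual_disk_cost_gbp` is hypothetical, and it assumes the charge scales linearly with every TB above the free allowance. The slide does not say how part-TBs are billed or whether the £40 per TB tape backup charge is annual, so tape is left out.

```python
FREE_ALLOWANCE_TB = 5      # first 5TB free per Data Steward (from the slide)
DISK_RATE_GBP_PER_TB = 400  # £400 per TB per annum for disk storage (from the slide)


def annual_disk_cost_gbp(storage_tb: float) -> float:
    """Estimate the yearly disk charge for one Data Steward.

    Assumption (not stated on the slide): the charge is linear in the
    number of TB held above the free allowance.
    """
    chargeable_tb = max(0.0, storage_tb - FREE_ALLOWANCE_TB)
    return chargeable_tb * DISK_RATE_GBP_PER_TB


if __name__ == "__main__":
    for tb in (3, 5, 8, 20):
        print(f"{tb} TB -> £{annual_disk_cost_gbp(tb):,.0f} per year")
```

Under these assumptions a Data Steward holding 8TB would pay for 3TB, i.e. £1,200 per year, while anything up to 5TB costs nothing.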

Page 50: RDM LIASA webinar

Institutional data repositories

Not intended to replace national, subject or other established data repositories

Acknowledge hybrid environment

http://datashare.is.ed.ac.uk

www.dspace.cam.ac.uk

https://databank.ora.ox.ac.uk

Research Data at Essex and DataPool at Southampton

Page 51: RDM LIASA webinar

Data catalogues

Develop a research data extension to the CERIF standard

JISC & DCC planning National coordination

http://cerif4datasets.wordpress.com

Page 52: RDM LIASA webinar

Bringing it all together into a service

Diagram courtesy of Sally Rumsey, University of Oxford

Page 53: RDM LIASA webinar

THE ROLE OF THE LIBRARY – RE-SKILLING FOR DATA CURATION

Page 54: RDM LIASA webinar

How are libraries engaging in RDM?

Library

IT

Research Office

The library is leading on most DCC institutional engagements

www.dcc.ac.uk/community/institutional-engagements

They are involved in:

• defining the institutional strategy

• developing RDM policy

• delivering training courses

• helping researchers to write DMPs

• advising on data sharing and citation

• setting up data repositories

• ...

Page 55: RDM LIASA webinar

Why should libraries support RDM?

• existing data and open access leadership roles

• often run publication repositories

• have good relationships with researchers

• proven liaison and negotiation skills

• knowledge of information management, metadata...

• highly relevant skill set

Page 56: RDM LIASA webinar

Possible Library RDM roles

• Leading on local (institutional) data policy

• Bringing data into undergraduate research-based learning

• Teaching data literacy to postgraduate students

• Developing researcher data awareness

• Providing advice, e.g. on writing DMPs or advice on RDM within a project

• Explaining the impact of sharing data, and how to cite data

• Signposting who in the Uni to consult in relation to a particular question

• Auditing to identify data sets for archiving or RDM needs

• Developing and managing access to data collections

• Documenting what datasets an institution has

• Developing local data curation capacity

• Promoting data reuse by making known what is available

RDMRose Lite

Page 57: RDM LIASA webinar

Training for librarians

• RDM for librarians, DCC

http://www.dcc.ac.uk/training/rdm-librarians

• RDMRose, University of Sheffield

http://rdmrose.group.shef.ac.uk

• Data Intelligence for librarians, 3TU, Netherlands

http://dataintelligence.3tu.nl/en/about-the-course

• DIY Training Kit for Librarians, University of Edinburgh

http://datalib.edina.ac.uk/mantra/libtraining.html

• SupportDM modules, University of East London

http://www.uel.ac.uk/trad/outputs/resources

Page 58: RDM LIASA webinar

RDM for Librarians

• 3 hour course by the DCC covering:

– Research data and RDM

– Data management planning

– Data sharing

– Skills

– RDM at [INSERT YOUR UNI]

• Slides and accompanying handbook

• Used UKDA guide as pre-reading

• http://www.dcc.ac.uk/training/rdm-librarians

Page 59: RDM LIASA webinar

RDMRose

• Taught and CPD learning materials in RDM tailored for information professionals, by the Uni of Sheffield

• 8 sessions, each of which is a half day of study

• Strong emphasis on practical hands-on activities

• Also offer a short (2hr) course called RDMRose Lite

• http://rdmrose.group.shef.ac.uk

Page 60: RDM LIASA webinar

Data Intelligence for Librarians

• A course produced by 3TU, a consortium of technical universities in the Netherlands

• Combination of online and face-to-face education

• Four meetings to learn and share knowledge

• Theory (on website) and assignments are conducted between sessions

• http://dataintelligence.3tu.nl/en/home

Page 61: RDM LIASA webinar

DIY Training Kit for Librarians

• By EDINA and Data Library at University of Edinburgh

• Self-directed course, intended to be used by a group of librarians to build confidence in supporting researchers

• MANTRA modules as pre-reading, short presentation, reflective questions and exercises to guide discussion

• Five face-to-face sessions

– Data Management Planning

– Organising and documenting data

– Data security and storage

– Ethics and copyright

– Data sharing

• http://datalib.edina.ac.uk/mantra/libtraining.html

Page 62: RDM LIASA webinar

SupportDM

• By the TraD project at the University of East London

• SupportDM comprises five sessions

– About research data management

– Providing guidance and support for researchers

– Data Management Planning

– Selecting which data to keep

– Cataloguing and sharing data

• Each topic is introduced in a face-to-face session and explored via exercises and discussion

• Learning is reinforced via an online tutorial and practical exercises to do before the next session

• http://www.uel.ac.uk/trad/outputs/resource

Page 63: RDM LIASA webinar

Thanks – any questions?

DCC guidance, tools and case studies: www.dcc.ac.uk/resources

Follow us on Twitter: @digitalcuration and #ukdcc