

Toward Long-lived Data Collection Curation and Management at IU

Robert H. McDonald
Associate Dean for Library Technologies & Digital Libraries
Associate Director-Data to Insight Center, Pervasive Technology Institute
[email protected]

OVERVIEW

o The Data Deluge
o Digital Data Collection Curation and History
o Agency Funding Perspectives
o IU Perspectives
  • DLP, ScholarWorks and Research Technologies
o Data Management Plans at IU
  • Joint Data Management Task Force

The Data Deluge – Information Drives Our Society

What is the potential impact of Global Warming?

How will natural disasters affect urban centers?

What therapies can be used to cure or control cancer?

What plants work best for biofuels?

Can we accurately predict market outcomes?

“Science is more essential for our prosperity, our security, our health, our environment, and our quality of life than it has ever been before.”

U.S. President Barack Obama

From Berman – Mobilizing the Data Deluge

Open Linked Data is Becoming More Important

Digital Data Collection Curation-Recent History

o 2003 – NSF Atkins Report on Revolutionizing Science and Engineering Through Cyberinfrastructure

o 2003 – NIH Releases Final Statement on Sharing Research Data

o 2007 - The Association of Research Libraries (ARL) establishes the ARL Joint Task Force on Library Support for E–Science

o 2007 - The National Science Foundation (NSF) publishes in January a report of the Sept-Oct 2006 NSF workshop, “History and Theory of Infrastructure: Lessons for New Scientific Cyberinfrastructures”

o 2007 - The Blue Ribbon Task Force on Sustainable Digital Preservation and Access (BRTF-SDPA) is funded by NSF and the Mellon Foundation, in partnership with the Library of Congress, the UK’s JISC, CLIR, and NARA.

Digital Data Collection Curation-Recent History

o 2007 - UIUC and Purdue, with funding from the Institute of Museum and Library Services (IMLS), launch the Curation Profiles Project (2007-2009)

o 2007 - NSF publishes its proposal for DataNet on September 28, 2007, envisioning “new types of organizations [that] will integrate library and archival sciences, cyberinfrastructure, computer and information sciences, and domain science expertise.”

o 2007 - JISC and the Mellon Foundation hold Workshop on Sharing and Curating Research Data, Washington, D.C., December 14, 2007.

o 2008 - The NSF’s National Science Board announces two major DataNet awards in December 2008, one to DataONE and the other to the Data Conservancy

o 2010 – NSF Funding Requirements for Data Management Plans

Agency Funding Perspectives

o NIH
  • First policies date back to 2003
  • Driven by clinical and translational medicine
  • Clear Guidelines
  • http://grants.nih.gov/grants/policy/data_sharing/

o NSF
  • Driven by Interest in a Commons Model for Cyberinfrastructure
  • Creation of Office of Cyberinfrastructure
  • Funding of National Scale Data Centers (NCAR-NODC-NCDC-NCGC-NSIDC-CDIAC)
  • Funding of DataNet Mass Curation Infrastructure
  • Data Management Plans as first step
  • http://www.nsf.gov/bfa/dias/policy/dmp.jsp

IU Perspectives

o 2007 Report of the Indiana University Research Data Management Taskforce
  • http://hdl.handle.net/2022/3359

o Empowering People (http://ep.iu.edu)
  • Recommendation A1 - Cyberinfrastructure
  • Recommendation B9 – 30-42
  • Recommendation B15 – 70-72

o IU Blue Ribbon Panel on Data Management
  • Summer 2010-Fall 2010

o IU System Wide Policy Group Being Assembled now by the Office of the Vice-President for Research

o Early Tests
  • Digital Library Program – ScholarWorks - UITS Research Technologies

o E-Science Library Position Posted 2010

IU ScholarWorks/DLP/RT Data Preservation Workflow

[Workflow diagram. Labels: IU MDSS; MDSS web server; HTTP Server; hpssfs filesystem; Item record with URLs of datasets in MDSS]

Courtesy Dunn et al.

Data Management Plans at IU

Report from the IU Blue-Ribbon Data Management Task Force concerning NSF Data Management Plan Requirements

• Beth Plale, IUB SoIC and D2I PTI, Chairperson
• Andrew Arenson, RT PTI
• Julie Bobay, IUB Libraries
• Geoffrey Brown, IUB SoIC
• Alan R Burdette, ATM, IDAH
• Michael T Casey, ATM
• Dennis J Cromwell, OVPIT
• Jon Dunn, IUB DLP
• Charles Dye, IUPUI Libraries
• Stacy T Kowalczyk, D2I PTI
• David Leake, IUB SoIC
• David Lewis, IUPUI Libraries
• Scott Long, IUB OVPR
• Robert McDonald, IUB Libraries
• Kristi Palmer, IUPUI Libraries
• Richard Repasky, RT PTI
• Kurt Siefert, RT PTI
• Craig Stewart, RT PTI
• Ruth Stone, IUB OVPR
• Joshua Sullivan, IUPUI Libraries
• Eric A Wernert, RT & D2I PTI

http://kb.iu.edu/data/anwu.html

IU Joint Data Management Task Force Highlights

o Recommendations
  • Data Creator
  • Data Archive
  • Data Storage
  • Data Sharing
  • Infrastructure
  • Funding

o IU researchers should be encouraged to look first to a domain-specific national data center when depositing their data. If that is not feasible, the researcher should look to IU for that capability.

o IU researchers should license their research data under terms that grant others the right to use the data in any way. This could be done through the Open Data Commons Public Domain Dedication and License (PDDL) or a comparable license.

o Develop a set of educational resources on data management specific to IU Researchers

Stewardship of Data Collections

From Berman – Mobilizing the Data Deluge

o Key Questions:

1) What should we save? – community perceptions of value and responsibility

2) How should we save it? – technology, best practice, policy

3) How can we sustain valuable data? – economics

[Figure. Labels: Cost; Time; Value; Key candidates for preservation]

References

o Association of Research Libraries, Association of American Universities, Coalition for Networked Information, and National Association of State Universities and Land-Grant Colleges. 2009. The university's role in the dissemination of research and scholarship.

o Atkins, D. 2003. A report from the U.S. National Science Foundation Blue Ribbon Panel on Cyberinfrastructure. Arlington, VA: Directorate for Computer & Information Science & Engineering of the National Science Foundation.

o Berman et al. Sustainable Economics for a Digital Planet: Ensuring Long-Term Access to Digital Information, Final Report of the Blue Ribbon Task Force on Sustainable Digital Preservation and Access, 2010. http://brtf.sdsc.edu/

o The Fourth Paradigm: Data-Intensive Scientific Discovery, Edited by Tony Hey, Stewart Tansley, and Kristin Tolle, 2009

o Gray, J., Szalay, A. S., Thakar, A. R., & Stoughton, C. 2002. Online Scientific Data Curation, Publication, and Archiving. Redmond, WA. Retrieved from http://arxiv.org/abs/cs.DL/0208012.

o Hedstrom, M. & S. Montgomery (1998). Digital Preservation Needs and Requirements in RLG Member Institutions. Mountain View, Calif.: Research Libraries Group. Retrieved on May 11, 2004 from http://www.rlg.org/preserv/digpres.html

o Lessig, Lawrence 2010. Getting Our Values around Copyright, Educause, Vol. 45(2), March/April 2010

o Jensen, S. and B. Plale. Extended Abstract: Schema-Independent and Schema-Friendly Scientific Metadata Management, 4th International IEEE Conference on e-Science, Indianapolis, IN Dec 2008.

o NSF Data Management & Sharing Frequently Asked Questions (FAQs) http://www.nsf.gov/bfa/dias/policy/dmpfaqs.jsp

o NSF Dissemination and Sharing of Research Results http://www.nsf.gov/bfa/dias/policy/dmp.jsp

Questions/Discussion

o Contact
  • Robert H. McDonald
  • [email protected]
  • rhmcdona on Unicom (Office Communicator)
  • mcdonald on twitter