Through or Around? Scientific Research Data and the Institutional Repository
Panel Presentation for the International Conference on University Libraries
Universidad Nacional Autónoma de México
November 6, 2013
Christopher Stewart, Ed.D.
Assistant Professor, Graduate School of Library and Information Science
Dominican University
Enabling Access to Research Data
Not a new issue for universities and academic libraries, but a rapidly developing one…
Data Archiving Requirements
Agency                Nation/Region
NSF                   United States
NIH                   United States
INSPIRE               European Union
UK Research Council   United Kingdom
ARC                   Australia
EUR-OCEANS            France
CIHR                  Canada
FONDAZIONE CARIPLO    Italy
Source: SHERPA/JULIET
Expanding the Mandate
U.S. Office of Science and Technology Policy directive, 2/22/2013*
*Requires each Federal agency with over $100 million in annual conduct of research and development expenditures to develop a framework for awardees.
Research Data can be:
• Heterogeneous
• Unless accompanying publication, often “raw”
• Highly idiosyncratic
• Characterized by implied description rather than explicit description
• Small and big
Big Data can be:
• Unstructured
• Unsuited for traditional (e.g., hierarchical, relational) database models
• Complete, not sampled
• Linked
Goals for Describing Scientific Research Data
• Access
• Re-use
• Context
• Content, not container (Yarmey & Yarmey, 2013)
Research Lifecycle
Source: University of Virginia Library, Data Consulting Group
Describing Scientific Research Data: Semantic Modeling
• Shared vocabularies provide metadata across a range of subjects
• Ontologies allow for contextual relationships
• Linked data enable multiple types of data, documents, etc. to be viewed as one database
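The "viewed as one database" idea can be sketched with plain subject-predicate-object triples. This is only an illustration, not anything shown in the talk: the identifiers (`article:42`, `dataset:7`, `person:jdoe`) and the `ex:isSupplementTo` predicate are hypothetical, and a real system would use resolvable URIs and an RDF library rather than Python tuples.

```python
from collections import defaultdict

# Hypothetical triples held by an institutional repository (a publication).
repository_triples = [
    ("article:42", "dc:title", "Lake sediment cores, 2012"),
    ("article:42", "dc:creator", "person:jdoe"),
]

# Hypothetical triples held by a domain repository (the underlying data).
domain_triples = [
    ("dataset:7", "dc:creator", "person:jdoe"),
    ("dataset:7", "ex:isSupplementTo", "article:42"),
    ("dataset:7", "dc:format", "text/csv"),
]


def build_graph(*sources):
    """Merge triples from any number of sources into one subject index."""
    graph = defaultdict(list)
    for source in sources:
        for subject, predicate, obj in source:
            graph[subject].append((predicate, obj))
    return graph


def subjects_with(graph, predicate, obj):
    """Return every subject that carries the given predicate/object pair."""
    return sorted(s for s, pairs in graph.items() if (predicate, obj) in pairs)


graph = build_graph(repository_triples, domain_triples)
works_by_jdoe = subjects_with(graph, "dc:creator", "person:jdoe")
print(works_by_jdoe)  # ['article:42', 'dataset:7']
```

Because both sources use the same identifier for the creator, one query returns the document and the dataset together, even though they live in different systems.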
Data Description Schemes (Greenberg, 2012)
• Simple: interoperable, easy to generate, low barrier, multidisciplinary, agnostic, flat, general, 15-25 properties
• Simple/moderate: interoperability with specific needs, requires expertise and greater domain focus, extensible, granular
• Complex: hierarchical and granular, domain-centered, extensive, 100+ properties
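One way to picture the gap between a flat scheme and a hierarchical one is to flatten a nested record and see what the flat form looks like. The sketch below is an assumption-laden illustration: the record, its field names, and the dotted-key convention are all hypothetical and do not come from any particular metadata standard.

```python
def flatten(record, prefix=""):
    """Collapse a nested dict into a flat dict with dotted keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Hypothetical domain-centered record with nested, granular description.
domain_record = {
    "title": "Stream temperature readings, 2012",
    "coverage": {
        "spatial": {"latitude": 41.9, "longitude": -87.8},
        "temporal": {"start": "2012-03-01", "end": "2012-10-31"},
    },
    "method": {"instrument": "thermistor", "interval_minutes": 15},
}

flat_record = flatten(domain_record)
for key in sorted(flat_record):
    print(key, "=", flat_record[key])
```

The nested record has three top-level properties; flattening turns them into seven independent ones, which hints at why domain schemes grow large and why a general flat scheme tends to drop this kind of context.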
Are Research Data Collections?
• Selecting: partially, though volume and scope of data challenge current digital collection development frameworks
• Acquiring: partially, though data not “owned”
• Describing: yes, although some content may reside elsewhere
• Organizing: yes, but not with “traditional” IR taxonomies
How Academic Libraries are Working with Research Data Now
• Institutional repositories are about all types of data, but are clearly set up for research publications (Salo, 2010)
• Most institutional repositories rely on Dublin Core, which OAI-PMH requires as the minimum for interoperability, but most research and exchange standards use XML/RDF as their base (Salo, 2010)
• Geared for output, not context
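To make "geared for output, not context" concrete, here is a minimal sketch of an unqualified Dublin Core record of the kind OAI-PMH exposes as `oai_dc`, built with Python's standard library. The two namespace URIs are the published ones; the field values are invented for illustration, and a real repository would wrap this record in a full OAI-PMH response.

```python
import xml.etree.ElementTree as ET

OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"

# Use the conventional prefixes when serializing.
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)


def make_oai_dc(fields):
    """Build an oai_dc:dc element from (element name, value) pairs."""
    root = ET.Element(f"{{{OAI_DC}}}dc")
    for name, value in fields:
        child = ET.SubElement(root, f"{{{DC}}}{name}")
        child.text = value
    return root


# Invented example values describing a dataset.
record = make_oai_dc([
    ("title", "Stream temperature readings, 2012"),
    ("creator", "Doe, Jane"),
    ("type", "Dataset"),
    ("format", "text/csv"),
    ("date", "2013-01-15"),
])

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The record says what the file is and who made it, but there is no obvious place for the instrument, sampling interval, or processing steps that a later user would need to reuse the data.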
Primary Metadata Use in Institutional Repositories
Standard       Percent of Use
Dublin Core    68%
OAI-PMH        46%
MARC           40%
Source: Simons & Richardson, 2012
Challenges for Current Data Curation Models in Academic Libraries
• Beyond project-level metadata, dataset-level metadata provides some context for data, but can be limited (Yarmey & Yarmey, 2013)
• Discoverability in institutional repositories is generally limited to library websites, catalogs, and Google Scholar (Burns, Lana, & Budd, 2013)
Content in Institutional Repositories
Content Type                    Number of Repositories Holding    Response Rate
Courseware                      14                                31%
Data sets                       23                                51%
Other                           25                                56%
Books                           29                                64%
Book chapters                   35                                78%
Tech reports, working papers    39                                87%
Conference articles             40                                89%
Presentations                   41                                91%
Theses and dissertations        43                                96%
Journal articles                44                                98%
Source: Burns, S. L., Lana, A., & Budd, J. M. (2013). Institutional Repositories: Exploration of Costs and Value. D-Lib Magazine, 19(1/2). Retrieved from http://www.dlib.org/dlib/january13/burns/01burns.html
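The percentage column can be recomputed from the counts as a consistency check. The sketch below assumes 45 responding repositories; that total is not stated on the slide and is inferred from the first row (14 repositories reported as 31%), so treat it as an assumption.

```python
# Counts as listed in the table above.
counts = {
    "Courseware": 14,
    "Data sets": 23,
    "Other": 25,
    "Books": 29,
    "Book chapters": 35,
    "Tech reports, working papers": 39,
    "Conference articles": 40,
    "Presentations": 41,
    "Theses and dissertations": 43,
    "Journal articles": 44,
}

# Assumption: 45 respondents, inferred from 14 repositories = 31%.
RESPONDENTS = 45


def share(count, total):
    """Percentage of respondents, rounded to the nearest whole number."""
    return round(100 * count / total)


shares = {content: share(n, RESPONDENTS) for content, n in counts.items()}
for content, pct in shares.items():
    print(f"{content:30s} {counts[content]:3d} {pct:3d}%")
```

Under that assumption, data sets are held by roughly half of the responding repositories, while journal articles and theses are held by nearly all of them.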
Domain Repositories
• Existing and developing metadata standards (e.g., Dryad/DCAM, ICPSR/DDI)
• Centralized or distributed (e.g., DataONE)
• Evidence suggests that scholars who deposit materials in subject repositories prefer them over institutional repositories, and are not likely to use both (Xia, 2008)
• Built around communities of interest
• Cost sharing for cloud services
Data Management: Education and Programming Opportunities for Academic Libraries
• Training and support for data management plans
• Data librarianship
• Data literacy
An Evolving Model
Subject/Domain Data Repository    Institutional Repository
“Raw” data                        Published data
Linked                            Hierarchical
Open Data                         Open Access
Complex description               Basic description
Multi-type data                   Multi-type documents
References
• Burns, S. L., Lana, A., & Budd, J. M. (2013). Institutional Repositories: Exploration of Costs and Value. D-Lib Magazine, 19(1/2). Retrieved from http://www.dlib.org/dlib/january13/burns/01burns.html
• Greenberg, J. (2012, August 22). Metadata for Managing Scientific Research Data. Presented at the NISO/DCMI Webinar. Retrieved from http://dublincore.org/resources/training/
• Salo, D. (2010). Retooling Libraries for the Data Challenge. Ariadne, (64). Retrieved from http://www.ariadne.ac.uk/issue64/salo
• Simons, N., & Richardson, J. (2012). New Roles, New Responsibilities: Examining Training Needs of Repository. Journal of Librarianship and Scholarly Communication, 1(2). Retrieved from http://jlsc-pub.org/jlsc/vol1/iss2/7/
• Xia, J. (2008). A Comparison of Subject and Institutional Repositories in Self-Archiving Practices. Journal of Academic Librarianship, 34(6), 489–495.
• Yarmey, K. A., & Yarmey, L. R. (2013). All in the Family: A Dinner Table Conversation about Libraries, Archives, Data, and Science. Archive Journal, Summer 2013(3). Retrieved from http://www.archivejournal.net/issue/3/archives-remixed/all-in-the-family-a-dinner-table-conversation-about-libraries-archives-data-and-science/
Image Credits
• Slide 7: http://www.newsrewired.com/2010/11/16/links-what-is-linked-data-and-why-does-it-matter-to-journalists-and-publishers/
• Slide 10: http://dmconsult.library.virginia.edu/