Using Dataverse Virtual Archive Technology for Research Data Management

Jonathan Crabtree, Cheryl Thompson

Upload: gary-wilhelm

Post on 30-May-2015


DESCRIPTION

One of the most important components of research is access to quality data. Digital data archives must work to increase submission rates to ensure that quality data exist for future researchers. This is a challenge, given that recent studies show that vast amounts of data collected during publicly funded projects are not being archived. Even the best-planned methodology will not succeed when researchers use tainted data or fail to find adequate data. Social science data archivists play a key role in the effort to maintain quality sources of data for social science investigators to repurpose and reuse. The dynamic, circular movement of data between producers and archives is critical to the future of social science research.
Data archives have historically supported this data interchange with considerable human capital. Dedicated archivists and investigators have worked together to ensure that data were processed and placed into the archive best designed for their preservation. This manual process has become increasingly expensive and unwieldy because of the volume of data being produced and the advanced metadata required to give future researchers enough detail to reuse a study. Typically, researchers work with the archives to deposit data long after the project has been completed and the papers published. Creating metadata manually at this point takes far longer than if it were collected earlier in the research life cycle. Recent advances in archival repository software may be the key to streamlining this increasingly inefficient process by allowing archivists and researchers to create detailed metadata earlier in the research life cycle, at a point where it will take far less time. Such software gives researchers greater personal control over archival ingest processes, bridging the gap between researchers and archives and possibly increasing submission rates of valuable data to archives.
Archival technology provides tools that manage automated ingest, data cataloging, advanced search and indexing, and rights and access issues. Archival tools also provide proper citation, creation of persistent identifiers, automatic creation of preservation formats, format migration, and statistical analysis of data. Customized branding and citation management can provide investigators collecting these data with a tool that will ensure that they get the credit they deserve. The Dataverse Network Technology has the potential to aid many research groups at UNC in the data management processes and has the potential for use in many disciplines. This presentation will explain the technology and its applicability for managing research data.

TRANSCRIPT

Page 1: Using Dataverse Virtual Archive Technology for Research Data Management

Jonathan Crabtree, Cheryl Thompson

Using Dataverse Virtual Archive Technology for Research Data Management

Page 2: Using Dataverse Virtual Archive Technology for Research Data Management

Outline
• Overview of Odum and issues around data management
• Concepts around Dataverse and federated data systems
• A look into Dataverse Virtual Archives
• Features of the Dataverse Network
• Benefits to Researchers & IT providers
• Exploring new possibilities

Page 3: Using Dataverse Virtual Archive Technology for Research Data Management

H. W. Odum Institute Archive Services

• The Howard W. Odum Institute was founded in 1924.

• It is the oldest multidisciplinary social science university institute.

• Odum Archive Services is host to the third largest catalog of machine-readable social science data in the U.S. 

• Founding member of Data-PASS

• Founding member of The Library of Congress NDSA

• The Odum Dataverse Network (DVN) catalog includes polling, census, and other social science and health-related data. 

Page 4: Using Dataverse Virtual Archive Technology for Research Data Management

The Problem
Different needs for archives, data libraries, researchers, journals, funding agencies…

• We should preserve the data
• I want credit for my data
• We need persistent links
• I need a Data Management Plan
• No publications without data

Cross, M. Why the Dataverse Network? Available at: thedata.org

Page 5: Using Dataverse Virtual Archive Technology for Research Data Management

Odum’s Solution
Dataverse Network: centralized professional archiving with distributed control and recognition

Cross, M. Why the Dataverse Network? Available at: thedata.org

• Persistent identifiers
• Fixity
• Backups & recovery
• Metadata standards
• Conversion standards
• Preservation standards

• Branding & visibility
• Data discovery
• Ease of use
• Scholarly citation
• Control over updates
• Terms of access & use

Page 6: Using Dataverse Virtual Archive Technology for Research Data Management

How it works?

Cross, M. Why the Dataverse Network? Available at: thedata.org

Page 7: Using Dataverse Virtual Archive Technology for Research Data Management

Supporting data
• Convert to a preservation format (data and metadata)
• Calculate Universal Numerical Fingerprint (UNF)
• Download in multiple formats
• Download a subset of the data
• Generate summary statistics
• Apply Zelig (R) statistical methods
• Visualize time series
• Define Terms of Use and Permission

Cross, M. Why the Dataverse Network? Available at: thedata.org

Tabular Data:
• STATA
• SPSS
• CSV + control card
• Tab delimited + DDI

Social Network Data:
• GraphML

Other data or relevant files:
• All formats are accepted, BUT only tabular files have full data support
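
The idea behind a numerical fingerprint such as the UNF listed on this slide is that the same data should yield the same fingerprint regardless of the file format it is stored in. The sketch below only illustrates that idea and is NOT the UNF algorithm: the function names, the rounding to 7 significant digits, the null-byte separator, and the truncation to 16 bytes are assumptions made for this example, and its output will not match real UNF values.

```python
import base64
import hashlib


def normalize(value, digits=7):
    """Render a value in a canonical text form (illustrative, not the UNF rules)."""
    if isinstance(value, (int, float)):
        # Fixed number of significant digits in exponential notation,
        # so 3 and 3.00000000001 normalize to the same text.
        return format(float(value), f".{digits - 1}e")
    return str(value)


def fingerprint(column, digits=7):
    """Hash the normalized values of one column into a short base64 string."""
    digest = hashlib.sha256()
    for value in column:
        digest.update(normalize(value, digits).encode("utf-8"))
        digest.update(b"\x00")  # separator so ["ab"] and ["a", "b"] differ
    # Truncation to 16 bytes is an assumption chosen to keep the string short.
    return base64.b64encode(digest.digest()[:16]).decode("ascii")
```

Because values are normalized before hashing, a column read from a STATA file and the same column read from a CSV file would produce the same fingerprint in this sketch, while any change to a value would produce a different one.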

Page 8: Using Dataverse Virtual Archive Technology for Research Data Management

Creating data citations
• Author(s)
• Year
• Title
• Persistent URL and ID
• UNF
• Distributor
• Version
• Other optional fields

Louis Harris and Associates, Inc., 1992, "Harris 1984 Female Veterans Survey, study no. 843002", http://hdl.handle.net/1902.29/H-843002 UNF:3:4VngKZgBorG/7T6aZSaq1g== Odum Institute;Odum Institute for Research in Social Science [Distributor] V1 [Version]

Cross, M. Why the Dataverse Network? Available at: thedata.org
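
As a rough illustration of how the citation fields above combine into the example citation on this slide, here is a minimal sketch. The function name `format_citation` and its parameters are hypothetical, and the layout is inferred from the single example shown; it is not the Dataverse Network's own citation code.

```python
def format_citation(authors, year, title, persistent_url, unf, distributors, version):
    """Assemble a data citation string in the layout of the slide's example."""
    author_part = "; ".join(authors)          # separator for multiple authors is an assumption
    distributor_part = ";".join(distributors)  # matches the ";" seen in the example
    return (
        f'{author_part}, {year}, "{title}", {persistent_url} '
        f"{unf} {distributor_part} [Distributor] V{version} [Version]"
    )


# Field values taken from the example citation on the slide.
example = format_citation(
    authors=["Louis Harris and Associates, Inc."],
    year=1992,
    title="Harris 1984 Female Veterans Survey, study no. 843002",
    persistent_url="http://hdl.handle.net/1902.29/H-843002",
    unf="UNF:3:4VngKZgBorG/7T6aZSaq1g==",
    distributors=["Odum Institute", "Odum Institute for Research in Social Science"],
    version=1,
)
print(example)
```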

Page 9: Using Dataverse Virtual Archive Technology for Research Data Management

Managing data and versions

[Figure: the contributor, curator, and admin view next to the end user view of a study containing Data File 1 and Data File 2; action shown: Edit study & add new file]

Cross, M. Why the Dataverse Network? Available at: thedata.org

Page 10: Using Dataverse Virtual Archive Technology for Research Data Management

Data never permanently deleted
A study is never permanently deleted after it is released. Curators or admins can deaccession the study.

Edit study

This study is deaccessioned. [Go to other study]

Cross, M. Why the Dataverse Network? Available at: thedata.org
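
The versioning and deaccession behavior described on the last two slides can be sketched as a small data model. The class and method names below are hypothetical and are not the Dataverse Network's actual data model; the sketch only shows that releasing a new version keeps the earlier ones and that deaccessioning hides a study from end users without deleting anything.

```python
from dataclasses import dataclass, field


@dataclass
class StudyVersion:
    number: int
    files: tuple  # immutable snapshot of the file names in this version


@dataclass
class Study:
    title: str
    versions: list = field(default_factory=list)
    deaccessioned: bool = False

    def release(self, files):
        """Record a new version; earlier versions stay in the history."""
        version = StudyVersion(number=len(self.versions) + 1, files=tuple(files))
        self.versions.append(version)
        return version

    def deaccession(self):
        """Flag the study as withdrawn instead of deleting its versions."""
        self.deaccessioned = True

    def end_user_view(self):
        """What an end user sees: a notice, or the latest released version."""
        if self.deaccessioned:
            return "This study is deaccessioned."
        if not self.versions:
            return "No released version."
        latest = self.versions[-1]
        return f"{self.title} V{latest.number}: {', '.join(latest.files)}"


study = Study("Example study")
study.release(["Data File 1"])
study.release(["Data File 1", "Data File 2"])  # edit study & add new file
```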

Page 11: Using Dataverse Virtual Archive Technology for Research Data Management

Supporting standards
• Study and variable metadata are exported into XML (Dublin Core, Data Documentation Initiative – DDI, FGDC) and MARC
• OAI-PMH for harvesting metadata
• LOCKSS for data duplication in multiple locations
• Z39.50 for distributed search
• E-Z Proxy to authenticate for data access
• Federations enabled via standards

Cross, M. Why the Dataverse Network? Available at: thedata.org
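
To make the OAI-PMH harvesting item on this slide more concrete, the sketch below builds a `ListRecords` request URL and pulls Dublin Core titles out of a response. The verb, the `metadataPrefix` parameter, and the XML namespaces come from the OAI-PMH 2.0 and Dublin Core standards; the base URL in the usage note, the helper names, and the hand-written sample response are assumptions for illustration and do not describe any particular Dataverse installation.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NAMESPACES = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def extract_titles(response_xml):
    """Return every Dublin Core title found in an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [element.text for element in root.iterfind(".//dc:title", NAMESPACES)]


# Hand-written sample response (not real harvested output).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Harris 1984 Female Veterans Survey, study no. 843002</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""
```

For example, `list_records_url("https://archive.example.org/oai")` yields a URL that a harvester could fetch, and `extract_titles(SAMPLE_RESPONSE)` returns the single title in the sample.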

Page 12: Using Dataverse Virtual Archive Technology for Research Data Management

Replicating data

Page 13: Using Dataverse Virtual Archive Technology for Research Data Management

Dataverse Virtual Archives
• Custom web skins
• Researchers retain control of data access
• Citations provide academic credit for data collection work
• Easy access to online research tools

Page 17: Using Dataverse Virtual Archive Technology for Research Data Management

Dataverse Features
• Federated search & discovery
• Online analysis
• Multi-format download
• Collection organization
• Automated metadata generation
• Custom metadata templates
• Controlled ingest workflows

Page 24: Using Dataverse Virtual Archive Technology for Research Data Management

Data archiving in 4 steps

1. Gather and convert study files to the appropriate format

2. Log into your virtual archive

3. Add a new study

4. Add the study files

Page 34: Using Dataverse Virtual Archive Technology for Research Data Management

Moving beyond social science
• Dataverse Network is cross-disciplinary.
• We are expanding the study metadata and building communities of interested groups: [email protected]

Cross, M. Why the Dataverse Network? Available at: thedata.org

Page 35: Using Dataverse Virtual Archive Technology for Research Data Management

Benefits to…

Researchers:
• Gives recognition to authors/researchers
• Creates a permanent data citation with UNF
• Converts data and study files to a preservable format
• Allows researchers to set who can access the data (and modify this at a later point)

IT/Computer support:
• It’s free
• Do not need additional software for Dataverse
• Offload long-term data preservation concerns

Page 36: Using Dataverse Virtual Archive Technology for Research Data Management

Questions?

Jonathan Crabtree, Asst. Director for Archives & IT
Phone: (919) 962-0517
Email: [email protected]

Cheryl A. Thompson, Graduate Research Assistant
Email: [email protected]

Email: [email protected]