Life Cycle Models & Principles

Jake Carlson

Associate Professor of Library Science

Data Services Specialist

Purdue University Libraries

What will be Covered

• An introduction to terms and concepts relating to data lifecycles.

• An understanding of the purpose of lifecycle models.

• Coverage of some lifecycle models and principles, and how they may relate to each other.

• An introduction to ICPSR’s lifecycle model, as a loose framework for this workshop.

Data Science

• “Data science enables the creation of data products.”

• “We're increasingly finding data in the wild, and data scientists are involved with gathering data, massaging it into a tractable form, making it tell its story, and presenting that story to others.”

– Loukides, M. (2011) What is Data Science? http://radar.oreilly.com/2010/06/what-is-data-science.html

Data Curation

• “…the active and on-going management of data through its lifecycle of interest and usefulness to scholarly and educational activities.” - UIUC GSLIS http://cirss.lis.illinois.edu/CollMeta/dcep.html

• “… the value-added activities and features that stewards of content engage in to make the content useful.” - Nancy McGovern, ICPSR

What is a Lifecycle?

The continuous sequence of changes undergone by an organism from one primary form, as a gamete, to the development of the same form again. http://www.dictionary.com

Graphic: http://insected.arizona.edu/manduca/Mand_cycle.html

Why Use Life Cycle Models?

• Help define and explain complex processes (graphically).

• Help to identify important components, roles, responsibilities, milestones, etc.

• Demonstrate connections and relationships between parts and the whole.

• Provide a framework to develop services and support.

Limitations of Lifecycle Models

• “All models are wrong, but some are useful” George E.P. Box, Statistician, 1976

– Models generally reflect the interests, perspectives (and biases) of the agencies that created them.
– Models mask complexity.
– Models tend to overlook heterogeneity / diversity.
– Models are often presented as orderly and linear.
– Models depict the ideal.

Aspects of Lifecycle Models

• Subject Based
– Scholarly Communication
– Research
– Data
– Curation

• Source Based
– Individual
– Organizational
– Community

Scholarly Communication Lifecycles

Gettysburg College Library

Graphic: http://www.gettysburg.edu/library/research/guides/scientific_information/index.dot

Research Lifecycles

Loughborough University Library (UK)

Graphic: http://www.lboro.ac.uk/services/library/research/

Scholarly Communication Lifecycles

Microsoft Research

Graphic: http://research.microsoft.com/en-us/news/features/zentity-052009.aspx

Research Lifecycle: Project

The Research360 Project will develop technical and human infrastructure for research data management at the University of Bath…

Focuses in particular on issues and challenges that arise from private sector partnerships and research collaborations.

http://blogs.bath.ac.uk/research360/about/

Research Lifecycles: Specialized

Cross-Cultural Surveys

Institute for Social Research

Graphic: http://ccsg.isr.umich.edu/intro.cfm

Research Lifecycle: Funding

Wayne State University, Division of Research

Graphic: http://spa.wayne.edu/grant/

Connecting Research & Data Lifecycles

“How JISC is Helping Researchers”

http://www.jisc.ac.uk/whatwedo/campaigns/res3/jischelp.aspx

Data Lifecycles

Chuck Humphrey (2006) “e-Science and the lifecycles of Research”

http://datalib.library.ualberta.ca/~humphrey/lifecycle-science060308.doc

A Data Curation Profile contains:

Information about an individual data set, including its data lifecycle.

Current management practice.

Unmet needs.

http://datacurationprofiles.org

Individual Data Lifecycles are Unique

Individual Data Lifecycles can be Complex

Data Lifecycle Model: UVA

Data Mining

Data Curation & Preservation

Publication Rights & Restrictions

DMP Consulting

Grant Writing & Planning

DM Planning

Metadata & Documentation

Data Processing

HPC/Visualization

Tool Development

Data Storage

Data Search

Image: University of Virginia Libraries Scientific Data Consulting Group: http://dmconsult.library.virginia.edu/

Data Lifecycle Model for ICPSR

1. Proposal and Planning

2. Project Start Up

3. Data Collection

4. Data Analysis

5. Preparing Data for Sharing

6. Deposit

ICPSR’s Guide to Social Science Data Preparation and Archiving:

http://www.icpsr.umich.edu/icpsrweb/content/deposit/guide/

Common Elements in Data Lifecycle

• Collect / Generate

• Process

• Analyze

• Finalize / Summarize for Publication

Curation Lifecycle

Neil Beagrie (2004) “The Continuing Access and Digital Preservation Strategy for the UK Joint Information Systems Committee (JISC)” D-Lib Magazine.

http://www.dlib.org/dlib/july04/beagrie/07beagrie.html

Curation Lifecycle: DCC

http://www.dcc.ac.uk/resources/curation-lifecycle-model

OAIS Reference Model: Preservation

ICPSR Pipeline Process

http://staging.icpsr.umich.edu/icpsralpha/content/datamanagement/lifecycle/oais.html

Deposit

Inputs – Materials to Deposit:
• Data
• Documentation
• Data Form (Description)

Outputs – SIP:
• Deposited Files
• Metadata from the Deposit
• Signed Deposit Form

Ingest

Actions:
• Processing Plan
• Assign a Study Number
• Formatting for Access and Preservation

Outputs – AIP:
• Data
• Documentation
• Set Up Files
• Processing History
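The Deposit and Ingest steps can be read as a transformation from one package to another. The sketch below is only an illustration of that idea, not ICPSR's actual pipeline: the class fields follow the slide's lists, while the `ingest` function, the study number format, and the rule that PDF files count as documentation are all hypothetical assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SIP:
    """Submission package: what the depositor hands over."""
    deposited_files: list[str]
    deposit_metadata: dict[str, str]
    signed_deposit_form: bool


@dataclass
class AIP:
    """Archival package: what the archive keeps after ingest."""
    study_number: str
    data: list[str]
    documentation: list[str]
    setup_files: list[str] = field(default_factory=list)
    processing_history: list[str] = field(default_factory=list)


def ingest(sip: SIP, study_number: str) -> AIP:
    """Turn a SIP into an AIP and record each step in the processing history."""
    if not sip.signed_deposit_form:
        raise ValueError("deposit form must be signed before ingest")
    # Assumption for illustration only: PDFs are documentation, the rest is data.
    docs = [f for f in sip.deposited_files if f.lower().endswith(".pdf")]
    data = [f for f in sip.deposited_files if f not in docs]
    history = [
        f"assigned study number {study_number}",
        f"received {len(sip.deposited_files)} deposited file(s)",
    ]
    return AIP(study_number, data, docs, processing_history=history)
```

Keeping the processing history inside the archival package means the record of what was done travels with the data it describes.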

Archival Storage

Actions:
• Migrations
• Checking integrity (checksums)
• Making, storing and synching redundant copies at various locations

Outputs – Curated AIP
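Integrity checking with checksums can be sketched in a few lines. This is a minimal example under stated assumptions: the slide does not say which checksum algorithm is used, so SHA-256 is chosen here for illustration, and the function names `sha256_of` and `copies_match` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original: Path, copies: list[Path]) -> dict[str, bool]:
    """Compare each redundant copy against the checksum of the original."""
    expected = sha256_of(original)
    return {str(copy): sha256_of(copy) == expected for copy in copies}
```

A copy whose entry is `False` no longer matches the original byte for byte and would need to be replaced from a good copy.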

Data Management

Actions:
• Populating the descriptive information
• Maintaining the descriptive information
• Making the descriptive information accessible

Outputs:
• Compliant Metadata

Access

Actions:
• Data set is indexed, searchable and made available.

Outcome – DIP:
• Data and document files
• Bibliography file
• Study description file
• Terms of use file
• File Manifest
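A file manifest is simply an inventory of what a package contains. The sketch below shows one possible way to generate one. The slide does not describe the contents of ICPSR's manifest, so the fields recorded here (relative name and size in bytes) and the function name `build_manifest` are assumptions.

```python
from pathlib import Path


def build_manifest(package_dir: Path) -> list[dict[str, object]]:
    """List every file under a package directory with its size in bytes."""
    entries = []
    for path in sorted(package_dir.rglob("*")):
        if path.is_file():
            entries.append({
                "name": path.relative_to(package_dir).as_posix(),
                "bytes": path.stat().st_size,
            })
    return entries
```

A user who downloads the package can compare the files received against the manifest to confirm nothing is missing.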

Common Elements in Curation Lifecycle

• Deposit / Ingest

• Storage

• Document / Describe

• Discover / Access / Use

• Manage

• Preserve

Lifecycle Models & Data Services

• Need for developing your organizational model – based on community models and informed by individual lifecycles.

• Need for alignment between data lifecycles and curation lifecycles – informed by research and scholarly communication lifecycles.

Alignment Between Lifecycles

[Diagram. Data lifecycle stages: Proposal Development & DMP; Project Start-up; Data Collection & File Creation; Data Analysis; Preparing Data for Sharing. Curation lifecycle functions: Ingest; Data Mgmt; Archival Storage; Access. Surrounding lifecycles: Research; Scholarly Communication.]

Example of Lifecycle Alignment

Image: Green, Ann G., and Myron P. Gutmann. (2007). “Building Partnerships Among Social Science Researchers, Institution-based Repositories, and Domain Specific Data Archives.” OCLC Systems and Services: International Digital Library Perspectives, 23: 35-53.
