Post on 21-Jan-2016

The Data Documentation Initiative: More Discussion

Chuck Humphrey

University of Alberta

Atlantic DLI Workshop 2005, Acadia University

Outline

• Two metadata challenges

• The evolution of DDI

• The story of the RDC / DDI project

Metadata Challenge #1

The correspondence between metadata and the tools that make use of it tends to be one-to-one.

Metadata              Tool
MARC                  OPAC
SPSS syntax file      SPSS
PDF file              Acrobat
SAS commands          SAS

Metadata Challenge #1

And we have to create and re-create metadata for each application.

Metadata Models for Data

• Here are some examples of historical metadata models for social science data. Notice that the characteristics of the metadata were bound by the tools of the day.

Metadata Challenge #2

Our application tools have tended to constrain our metadata.

Direction of control:

• Desired service: title search

• Choose a tool: card catalog

• Tool defines the metadata format: 3x5 card

• We create metadata to fit a format: small (e.g., 3 subject headings max.)

Metadata Challenge #2

Consider how the maximum length of variable labels in various statistical packages has constrained the brief variable descriptions in our metadata.

Statistical Package    Max. Length of Var. Labels (characters)
SPSS 7.5               120
SPSS 12                255
SAS 6.12               40
SAS 8.0                256
STATA 6.0              80
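
As a rough illustration of the constraint, the sketch below applies the limits from the table above to a single variable description. The label text and the function names are hypothetical and were invented for this example; only the limits come from the slide.

```python
# Maximum variable-label lengths (characters), as listed in the table above.
MAX_LABEL_LENGTH = {
    "SPSS 7.5": 120,
    "SPSS 12": 255,
    "SAS 6.12": 40,
    "SAS 8.0": 256,
    "STATA 6.0": 80,
}


def fit_label(label: str, package: str) -> str:
    """Return the label cut to the maximum length the package accepts."""
    return label[: MAX_LABEL_LENGTH[package]]


def packages_that_truncate(label: str) -> list[str]:
    """List the packages whose limit is shorter than the full label."""
    return [pkg for pkg, limit in MAX_LABEL_LENGTH.items() if len(label) > limit]


# Hypothetical label, invented for illustration only.
full_label = (
    "Number of hours per week the respondent usually spent "
    "providing unpaid care to seniors"
)

# Show what each package would actually be able to store.
for pkg in MAX_LABEL_LENGTH:
    print(f"{pkg:10} {fit_label(full_label, pkg)!r}")
```

A description written to fit the shortest limit loses information that the longer limits could have carried, which is the constraint this slide describes.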

Metadata Challenge #2

The dilemma created by limiting our metadata to current tools is that when new tools arise or new services are sought that can make use of richer metadata, we will not have created it and must face re-creating the metadata.

Lessons from These Challenges

Metadata should be created to go beyond simple one-to-one use and should be reusable for more than one purpose.

Metadata should be created to describe data, not to meet the needs of one system, one service.

Applying These Lessons

[Diagram with four columns: Function, Tools, Metadata, Uses]

Function: Proposal, Sample design, Questionnaire, Pre-test, Revisions, Collection, Processing, Dissemination

Tools: Blaise, SAS, IMDB, PDF, Word, Paper

Metadata: DDI

Uses: IMDB, Nesstar, Oracle, Library OPAC, Stat Software, Google, PDF, print, html, RSS, DDI 3, 4 ...

DDI Versions 1 & 2

1.0 Document Description
2.0 Study Description
3.0 Data Files Description
4.0 Variable Description
5.0 Other Study-related Materials

The first two versions of DDI were modeled after the traditional ‘codebook’ made up of a user’s guide, data dictionary and record layout.
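
To make the codebook structure concrete, here is a minimal sketch that builds an empty skeleton with one element per section. The slide gives only the section titles. The element names used here (`codeBook`, `docDscr`, `stdyDscr`, `fileDscr`, `dataDscr`, `otherMat`, `var`, `labl`) are the names I understand the DDI Codebook 2.x schema to use, and they should be checked against the published schema. The sample variable is hypothetical, and the output is not a schema-valid DDI instance (no namespace, no required child elements).

```python
import xml.etree.ElementTree as ET

# Assumed mapping from DDI 1/2 element names to the slide's section titles.
SECTIONS = [
    ("docDscr", "1.0 Document Description"),
    ("stdyDscr", "2.0 Study Description"),
    ("fileDscr", "3.0 Data Files Description"),
    ("dataDscr", "4.0 Variable Description"),
    ("otherMat", "5.0 Other Study-related Materials"),
]


def build_skeleton() -> ET.Element:
    """Create a codeBook element with one empty child per section."""
    root = ET.Element("codeBook")
    for tag, _title in SECTIONS:
        ET.SubElement(root, tag)
    return root


root = build_skeleton()

# Add one hypothetical variable to the Variable Description section.
var = ET.SubElement(root.find("dataDscr"), "var", name="agegrp")
ET.SubElement(var, "labl").text = "Age group of respondent"

print(ET.tostring(root, encoding="unicode"))
```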

DDI Version 3 (Draft)

The draft for Version 3 is based on a process model and attempts to describe the stages within data creation using a life cycle perspective.

1. Start up
2. Planning
3. Execution
4. Close Out

Project Partnerships

• RDC Network
  • RDCs in the pilot include McMaster, Prairie and Alberta
  • RDC Central has a Nesstar licence
• DLI
  • DLI Central shares the Nesstar licence and is working on converting PUMFs to DDI
  • DLI EAC approved joining the DDI Alliance

Project Partnerships

• General Social Survey
  • Permission to use Cycle 17 in the pilot
  • Provided a contact to assist with the data documentation
• Standards Division
  • Interested in a pilot that would expose the issues of using DDI to document data

Project Operation

• No formal budget at this point. All contributions to the project are in kind.

• Irene Wong is conducting the evaluation and creation of DDI documentation in the Alberta RDC.

• Sharon Neary, associated with the Prairie RDC, is coordinating training for end-users.

Project Operation

• Byron Spencer is coordinating an evaluation of the Nesstar application of DDI in the McMaster RDC with end-users.

We need data discovery tools in DLI and the RDCs.

Project Status

• The DDI-compliant documentation for the GSS Cycle 17 master file has been completed and is now being tested at McMaster’s RDC.

• Irene is completing a report describing the process of creating the DDI version of the documentation and an assessment of DDI strengths and weaknesses.

Metadata Life-Cycle Research

One outcome of this project will be to comment on the amount of metadata produced over the life cycle of a survey and to identify the existing tools in which this metadata has been created and stored.
