
Incentivising the uptake of reusable metadata in the survey production process

ESRA15

Reykjavik

July 2015

Louise Corti

Collections Development and Producer Support

Why worry about metadata?

• No universal language used to document questions and variables

• Too many bespoke systems and vocabularies around

• Massive waste of human resource in the survey data lifecycle

• Interoperability saves money

• Why don’t we all use the Data Documentation Initiative (DDI)?

Who needs incentivising?

Show how to exploit metadata for surveys

• Challenge – to get established survey operations to recognise the benefits of reusable metadata

• The Midlife in the United States (MIDUS) study is almost unique in this respect!

• Help funders, owners and producers ‘See the light’

• For this we need to show something very cool

• Some good experimental stuff happening

Benefits of publishing rich survey metadata

• Survey documentation systems

• Question banks

• Survey data exploration systems

• Nesstar

• SDA

• Bespoke visualisation systems

Published outputs – question bank

Published outputs – online access

The reality

• Hard to match up Question and Variable information

• Too much manual data entry involved in publishing

• Must do better

• Gain rich reusable metadata from the survey design and production process

Survey production lifecycle

• Beset with manual processes

• Legacy systems

• Reluctance to change or adapt systems

• Hard to embrace new ways – disruptive, expensive

Typical process – worst case scenario

• Manual questionnaire entry (doc/Excel/database)

• Export in Word format

• Deliver to survey agency

• Manual transfer to IBM Data Collection

• Export SPSS data and PDF/Word questionnaire

Survey Metadata: Barriers & Opportunities

Workshop: 26 June 2014

Meeting outcomes

• Great turnout and knowledge exchange!

• Quick turnaround of principles into a ‘campaign’ document and a published ‘Questionnaire Profile’

• Some very positive responses – shared problem

• Be an advocate!

Increasing use of XML for survey design and publishing

Such as:

• Social science data archive published survey metadata (DDI 2.5) – see the sketch below

• Essex panel studies – bespoke XML Questionnaire Specification Language for survey design

• UK LifeStudy – survey design instrument – XML
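To make that first bullet concrete, here is a minimal sketch of what archive-published DDI survey metadata enables: a cut-down, DDI 2.5 Codebook-style variable description (the inline XML is illustrative only and omits namespaces and most required elements), parsed with standard-library Python to recover variable names, labels and question text.

```python
# Minimal sketch: parsing a simplified, DDI 2.5 Codebook-style snippet.
# The XML below is illustrative; real archive-published DDI files are richer.
import xml.etree.ElementTree as ET

ddi_snippet = """
<codeBook>
  <dataDscr>
    <var name="empstat">
      <labl>Current employment status</labl>
      <qstn>
        <qstnLit>Which of these best describes your current employment situation?</qstnLit>
      </qstn>
    </var>
    <var name="netinc">
      <labl>Net monthly household income</labl>
      <qstn>
        <qstnLit>What is your household's total net monthly income?</qstnLit>
      </qstn>
    </var>
  </dataDscr>
</codeBook>
"""

root = ET.fromstring(ddi_snippet)
for var in root.iter("var"):
    name = var.get("name")
    label = var.findtext("labl")
    question = var.findtext("qstn/qstnLit")
    print(f"{name}: {label!r} <- {question!r}")
```

Once question text and variable labels sit in structured XML like this, question banks and data exploration systems can be populated without re-keying.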

Discussing DDI implementation today

• CLOSER cohorts portal using the DDI 3.2 Questionnaire Profile

• DASISH DDI 3.2 use

• Blaise – import by the Michigan Questionnaire Documentation System (MQDS), DDI 3

• IBM Data Collection DDI experiments

Short brochure for sharable survey products

• Work closely with data owners and producers

• Existing information on data sharing is complex

• What is really expected?

• Transferable information

• Not a bible

Sticks?

• Specifying data documentation requirements in the commissioning tender for fieldwork

• Mapping between questions and data outputs

• Improved readable questionnaire for end users

CLOSER project

• Funded variable/question discovery service

• Long-running birth cohorts & longitudinal studies

• Drivers for project

• Harmonisation (biomedical, socio-economic)

• Capacity building

• Data Linkage

• Impact

• Discovery

• Encourage use of existing data resources

• Tools for enhancing survey metadata

Incentives for CLOSER PIs?

• Large award to get prestigious cohort studies on board £££

• Reduce burden - enhancement work done centrally

• Survey data managers:

  • happy to be part of a peer group

  • found it rewarding to go back and look at the data

  • liked a shared controlled vocabulary

  • received training

  • found variable-to-questionnaire mappings useful

  • liked the visibility of their study in the search platform

Forward looking survey design

• Think upfront about reusability of questionnaire metadata

• New studies – new opportunities

• Legacy work to get old messy survey design metadata into a new environment – may be worth investing in

• Can make harmonisation work so much easier – XML schemas allow formal linkage of variables across time, equivalence, differences etc. (a toy sketch follows below)
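As a toy illustration of that last point (all wave and variable names here are invented), a harmonised concept can be linked explicitly to the wave-specific variables that measure it, with equivalences and differences recorded as data rather than buried in prose. A production system would carry this in DDI structures rather than a Python dictionary.

```python
# Toy sketch of a harmonisation map: one conceptual variable linked to the
# wave-specific variables that measure it, with a note on comparability.
# All names are hypothetical.
from dataclasses import dataclass

@dataclass
class WaveVariable:
    wave: str
    variable: str
    comparability: str  # e.g. "identical", "equivalent", or a note on differences

harmonisation_map = {
    "net_household_income": [
        WaveVariable("wave1_2008", "hhinc_net", "identical"),
        WaveVariable("wave2_2012", "hh_netinc", "equivalent (banded, not continuous)"),
        WaveVariable("wave3_2016", "netinc_hh", "identical"),
    ],
}

def variables_for(concept: str):
    """Return the wave-specific variables measuring a harmonised concept."""
    return harmonisation_map.get(concept, [])

for wv in variables_for("net_household_income"):
    print(f"{wv.wave}: {wv.variable} [{wv.comparability}]")
```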

Data publishers

• Survey owners/producers - documentation online

• Question banks

• Journals - supporting data with sufficient metadata

• Use the DDI 3.2 Questionnaire Profile, not bespoke schemas

Self-deposit expectations?

• Peer review of data by data centres for all data published – includes quality of metadata

• Journals – no unified standard for data description or documentation

• Start with minimal metadata expectations (see the sketch below):

  • data collection description

  • provenance

  • data description: file and variable names, labels

  • relationships between tables/files
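Much of this variable-level description can be harvested from the deposited data files rather than re-keyed. A possible sketch, assuming an SPSS deposit and the third-party pyreadstat library (the file name is hypothetical):

```python
# Sketch: harvesting minimal variable-level metadata from an SPSS deposit
# using pyreadstat (pip install pyreadstat). The file path is hypothetical.
import pyreadstat

# metadataonly=True reads the file header only, so large files are cheap to scan.
_, meta = pyreadstat.read_sav("deposit/survey_wave1.sav", metadataonly=True)

print(f"File: survey_wave1.sav ({len(meta.column_names)} variables)")
for name, label in meta.column_names_to_labels.items():
    print(f"  {name}: {label or '(no label)'}")
```

The same listing can seed the data description section of a deposit record, leaving only provenance and data collection details to write by hand.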

Some tips on incentivising

• Speak a common language

• On DDI, don’t drown in detail; use existing profiles

• Start with the lowest common denominator. Baby steps

• Show value – shiny interfaces and examples!

• Provide supporting tools where possible e.g. metadata entry

• Integrate into everyday workflows and research tools

CONTACT

UK Data Service

University of Essex

Wivenhoe Park

Colchester

Essex CO4 3SQ


T +44 (0)1206 872145

E [email protected]

