First Thoughts on Editing in Mixed Modes in the 2011 Census
Heather Wagstaff and Ruth Wallis
Methodology Directorate
Office for National Statistics, U.K.
Overview
The presentation is structured as follows:
Overview of 2001 Census Editing Process
Challenges for editing in mixed modes
Integration of processing streams
Concluding remarks
Overview: 2001 Census Capture and Coding
Capture and coding operation outsourced:
scanning and capture by OMR / OCR
preliminary univariate edits applied
Statistics Canada ACTR for complex coding
integrated quality assurance system
207 million tick boxes and 1.1 billion characters sent to keyers for correction
Overview: Lessons Learned
Overall the 2001 Census editing process was a success
However, there were unexpected scanning and electronic capture errors:
almost 48k duplicate individual records
almost 3.3 million spurious individuals
households containing only children
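Errors of this kind can be screened for with simple post-capture checks. The following is a minimal sketch only: the record layout, the sample data and the adult age threshold of 16 are assumptions made for illustration, not the checks used in 2001.

```python
from collections import Counter, defaultdict

# Hypothetical record layout: (household_id, name, date_of_birth, age)
records = [
    ("H1", "Ann Smith", "1970-02-01", 31),
    ("H1", "Ann Smith", "1970-02-01", 31),  # same person captured twice
    ("H2", "Tom Jones", "1995-06-30", 5),
    ("H2", "Amy Jones", "1997-01-15", 4),
]

ADULT_AGE = 16  # assumed threshold, for illustration only


def find_duplicates(records):
    """Return each record that appears more than once."""
    counts = Counter(records)
    return [rec for rec, n in counts.items() if n > 1]


def child_only_households(records, adult_age=ADULT_AGE):
    """Return ids of households in which every person is below adult_age."""
    ages = defaultdict(list)
    for household_id, _name, _dob, age in records:
        ages[household_id].append(age)
    return sorted(h for h, a in ages.items() if max(a) < adult_age)


duplicates = find_duplicates(records)
child_only = child_only_households(records)
```

Exact matching is the simplest possible rule; records that differ only through a capture error would need a looser comparison.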
Internet Capture: Validity vs Legibility
Between-mode combinations of validity and legibility:
Response type   Paper (valid, legible)   Internet (valid, legible)
1
2   x - -
3   x x
4   x x - -
Mode effect: legibility between paper/internet responses
Hence, internet capture eliminates scanning error for responses submitted on-line
Internet Capture: Automated Routing
Evaluation of the 2001 Census responses found:
a number of respondents had difficulties understanding the requirements and did not follow the instructions.
Routing can be automated on-line:
questions that are ‘not applicable’ should not be presented to the respondent
but respondents should be made aware that they have been routed past those questions.
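Automated routing of this kind can be sketched as a set of rules that decide, from the answers given so far, which questions apply. The question names and rules below are hypothetical and chosen only to illustrate the idea; they are not the census questionnaire.

```python
# Hypothetical questions and routing rules, for illustration only.
QUESTIONS = ["age", "in_employment", "occupation", "hours_worked"]

# Each rule says whether a question applies, given the answers so far.
# Questions without a rule always apply.
APPLIES = {
    "in_employment": lambda a: a.get("age", 0) >= 16,
    "occupation": lambda a: a.get("in_employment") == "yes",
    "hours_worked": lambda a: a.get("in_employment") == "yes",
}


def route(answers):
    """Split the questionnaire into questions to present and questions to skip."""
    presented, skipped = [], []
    for question in QUESTIONS:
        rule = APPLIES.get(question, lambda a: True)
        (presented if rule(answers) else skipped).append(question)
    return presented, skipped


def skip_notice(skipped):
    """Message telling the respondent which questions were routed past."""
    if not skipped:
        return ""
    return "Not applicable, skipped: " + ", ".join(skipped)


presented, skipped = route({"age": 12})
notice = skip_notice(skipped)
```

Returning the skipped questions alongside the presented ones is what allows the interface to tell respondents that routing has taken place.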
Internet Capture: Radio Buttons
On-line interface provides opportunity to apply edits in real time
radio buttons prevent multi-ticked responses
Useful on-line messages for editing:
highlight item non-response or partial item non-response;
highlight values outside a pre-specified range for numeric responses; or
simply ask for confirmation of implausible values.
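The three kinds of message can be illustrated for a single numeric item. This is a sketch under stated assumptions: the function name, the status labels and the thresholds in the usage lines are hypothetical, not part of any ONS interface.

```python
def check_numeric(item, raw, low, high, plausible_high):
    """Return (status, message) for one numeric response.

    status is "error" (respondent must correct the value),
    "confirm" (value is valid but implausible) or "ok".
    """
    text = raw.strip()
    if not text:
        # Item non-response
        return "error", f"Please answer '{item}'."
    try:
        value = int(text)
    except ValueError:
        return "error", f"'{item}' must be a whole number."
    if not low <= value <= high:
        # Value outside the pre-specified range
        return "error", f"'{item}' must be between {low} and {high}."
    if value > plausible_high:
        # In range but implausible: ask for confirmation only
        return "confirm", f"You entered {value} for '{item}'. Is this correct?"
    return "ok", ""


# Assumed limits for an age question, for illustration only.
missing = check_numeric("age", "", 0, 130, 105)
out_of_range = check_numeric("age", "200", 0, 130, 105)
implausible = check_numeric("age", "110", 0, 130, 105)
accepted = check_numeric("age", "42", 0, 130, 105)
```

Separating "error" from "confirm" keeps hard edits for impossible values and leaves unusual but possible values to the respondent.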
Internet Capture: Personalisation
UK Census form fillers record names three times:
listing grid; relationship matrix; person questions.
2001 evaluation found a large number of records with inconsistent ordering
personalisation maintains consistency of ordering
‘How is Mary related to John?’
improve quality and reduce edits
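Personalisation can be sketched as generating the relationship questions directly from the names entered in the listing grid, so that the ordering is fixed once and reused. The function name and the pairing rule (each person is asked about everyone listed before them) are assumptions made for illustration.

```python
def relationship_questions(names):
    """Build one question per pair of household members, in listing order."""
    questions = []
    for i, later in enumerate(names):
        for earlier in names[:i]:
            questions.append(f"How is {later} related to {earlier}?")
    return questions


# Names as entered once in the listing grid (hypothetical household).
household = ["John", "Mary", "Sam"]
questions = relationship_questions(household)
```

Because the names are entered once and every later question is derived from that list, the respondent cannot introduce a different ordering in the relationship matrix or the person questions.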
Internet Capture: Complex Coding
Concerned with accuracy and consistency.
Must ensure:
individual coders assign the same code over time; and
coders assign the same codes as each other.
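Both requirements can be monitored with a simple agreement rate between two codings of the same responses. This is a minimal sketch: the codes below are made-up values for illustration, and a production quality assurance system would likely use a chance-corrected measure as well.

```python
def agreement_rate(codes_a, codes_b):
    """Proportion of responses given the same code in both codings."""
    if len(codes_a) != len(codes_b):
        raise ValueError("codings must cover the same responses")
    if not codes_a:
        raise ValueError("no responses to compare")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


# Hypothetical codes assigned to the same five write-in responses.
coder_1_first_pass = ["2314", "5231", "9233", "4150", "2314"]
coder_1_second_pass = ["2314", "5231", "9233", "4150", "2315"]
coder_2 = ["2314", "5231", "9234", "4150", "2315"]

# Same coder, two points in time (consistency over time).
within_coder = agreement_rate(coder_1_first_pass, coder_1_second_pass)
# Two different coders (consistency between coders).
between_coders = agreement_rate(coder_1_first_pass, coder_2)
```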
Internet Capture: Complex Coding
Standard Occupational Classification
Consistency is conceptually difficult to achieve for on-line capture
Integration of Responses
Integration process is challenging and possibilities are dependent on:
formatting of the Internet form and questions
functionality, including editing, applied in the interface
Two broad options:
if internet responses are held as raw ticks and text: integrate immediately after scanning and prior to recognition; or
if internet responses are fully captured and simple coded: integrate immediately after recognition and prior to edit
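The two options amount to choosing the stage at which internet responses join the paper processing stream. The sketch below makes that explicit; the stage names and format labels are hypothetical simplifications of the process described above.

```python
# Simplified paper processing stream (assumed stage names).
PAPER_STAGES = ["scanning", "recognition", "edit"]

# First paper stage that an internet response still needs,
# keyed by the form in which the internet data arrive.
ENTRY_STAGE = {
    "raw_tick_and_text": "recognition",  # joins after scanning, before recognition
    "captured_and_coded": "edit",        # joins after recognition, before edit
}


def remaining_stages(internet_format):
    """Return the paper-stream stages an internet response still passes through."""
    entry = ENTRY_STAGE[internet_format]
    return PAPER_STAGES[PAPER_STAGES.index(entry):]


raw_route = remaining_stages("raw_tick_and_text")
coded_route = remaining_stages("captured_and_coded")
```

Under either option both modes pass through the same edit stage, which is what keeps the edited outputs comparable between paper and internet.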
Concluding Remarks
ONS aim for internet capture is to:
improve response and reduce respondent burden
improve data quality
deliver efficiency gains by reducing the volume of subsequent editing
But there is still much work to do to:
understand the key drivers and levels of bias; and
develop strategies to mitigate it.