Imputation in the 2001 Census



Page 1: Imputation in the 2001 Census
Page 2: Imputation in the 2001 Census

Imputation in the 2001 Census

Robert Beatty

NILS User Forum

11 December 2009

Page 3: Imputation in the 2001 Census

Coverage

How Census deals with
  Missing households
  Missing people within households
  Incomplete returns

Page 4: Imputation in the 2001 Census

Coverage

Census is statutory
Census Act (Northern Ireland) 1969
Penalties for non-compliance
Therefore counts everyone
Doesn’t it?

Page 5: Imputation in the 2001 Census

Coverage

Population in thousands

        Published Census figure    Mid-year estimate (MYE)
1991    1,578                      1,607

Page 6: Imputation in the 2001 Census

Coverage

Population in thousands

        Published Census figure    MYE
1991    1,578 (enumerated)         1,607 (best estimate)

Page 7: Imputation in the 2001 Census

Coverage - international

Australia 2006 – 96% coverage
Don’t impute but adjust MYEs
New Zealand 2006 – 95% response rate
NZ imputed for non-response, but only on 4 key variables
Canada ‘adjust for non-responding households’ – need to know about occupied households

Page 8: Imputation in the 2001 Census

Adjustment issues

1991 coverage – 98%
But inference about population?
Non-response not homogeneous
  Young adults
  Lower social class
  Deprived areas
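The 98% coverage figure can be reproduced from the 1991 numbers on the earlier slides. The sketch below assumes that coverage is simply the enumerated count divided by the mid-year estimate; the function and variable names are illustrative and not taken from the slides.

```python
# Figures from the slides, in thousands.
enumerated_1991 = 1578   # published 1991 Census count (enumerated)
mye_1991 = 1607          # mid-year estimate, treated here as the best estimate


def coverage_rate(enumerated, best_estimate):
    """Return the share of the estimated population that was enumerated."""
    return enumerated / best_estimate


rate = coverage_rate(enumerated_1991, mye_1991)
shortfall = mye_1991 - enumerated_1991  # people not enumerated, in thousands

print(f"coverage: {rate:.1%}, shortfall: {shortfall} thousand")
```

On these figures the shortfall is 29 thousand people, which is the gap that the 2001 adjustment was designed to close.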

Page 9: Imputation in the 2001 Census

Coverage - 2001

Acknowledge under-enumeration
1991 Census 1,578k, MYE 1,607k
Decision to adjust Census 2001 database
Objective – all Census outputs to fully reflect whole population
‘One Number Census’
Census = MYE

Page 10: Imputation in the 2001 Census

Coverage

Population in thousands

        Published Census figure    MYE
1991    1,578 (enumerated)         1,607
2001    1,685 (adjusted)           1,689

Page 11: Imputation in the 2001 Census

Coverage - 2001

‘One Number Census’ method
Basic principle to use a large-scale Census Coverage Survey (CCS) to estimate under-enumeration in sampled areas
Apply survey estimates elsewhere

Page 12: Imputation in the 2001 Census

Census Coverage Survey

UK split into about 100 Estimation Areas (EAs), each with about 0.5m population
Three in Northern Ireland
About 200 postcodes / 3,000 households per Estimation Area
Three socio-economic strata within EA
Separate analysis in each stratum within EA

Page 13: Imputation in the 2001 Census

Census Coverage Survey

Fieldwork about 3 weeks after Census day
Face-to-face interviews
Trained interviewers
Given map of postcode boundary
Asked to re-enumerate the postcode
Short questionnaire - coverage

Page 14: Imputation in the 2001 Census

Matching

Forms scanned into system
Special matching software developed
Database retrieval system
CCS returns carefully matched with Census returns – error rate estimated to be under 0.1 per cent

Page 15: Imputation in the 2001 Census

Dual System Estimator (DSE)

Use matched Census and CCS data
DSE estimates adjustment for those missed in both Census and CCS

                           Counted by CCS
                           Yes     No      Total
Counted by Census  Yes     n11     n10     n1+
                   No      n01     n00     n0+
                   Total   n+1     n+0     n++

DSE estimate for the area (under certain assumptions):

$$ n_{++} = \frac{n_{1+} \, n_{+1}}{n_{11}} $$
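The slide does not list its assumptions. A common way to motivate the estimator, offered here as a sketch rather than as the presenter's own derivation, is to assume that being counted by the Census and being counted by the CCS are independent. The share of CCS-counted people who were also found in the Census then estimates the Census coverage rate for the whole area, which gives the DSE and an implied count of people missed by both:

```latex
\frac{n_{11}}{n_{+1}} \approx \frac{n_{1+}}{n_{++}}
\quad\Longrightarrow\quad
\hat{n}_{++} = \frac{n_{1+}\, n_{+1}}{n_{11}},
\qquad
\hat{n}_{00} = \hat{n}_{++} - n_{11} - n_{10} - n_{01}
             = \frac{n_{10}\, n_{01}}{n_{11}}
```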

Page 16: Imputation in the 2001 Census

DSE: Simple Example

Fish pond

Day 1: Catch 950 fish, mark with a red dot.
Day 2: Catch 900 fish, mark with a blue dot.
Matched: 855 had blue and red dots.
Question – how many fish in the pond?

Page 17: Imputation in the 2001 Census

Dual System Estimator (DSE)

                       Counted Day 2
                       Yes     No      Total
Counted Day 1  Yes     855     95      950
               No      45      n00     n0+
               Total   900     n+0     n++

DSE estimate of the actual number of fish:

$$ n_{++} = \frac{950 \times 900}{855} = 1{,}000 $$
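The fish pond arithmetic can be checked with a few lines of code. This is a minimal sketch of the dual system (capture-recapture) calculation using the slide's figures; the function name and its error handling are illustrative choices, not part of the One Number Census software.

```python
def dual_system_estimate(n_first, n_second, n_both):
    """Dual system (capture-recapture) estimate of the total population.

    n_first  -- number counted by the first system (n1+)
    n_second -- number counted by the second system (n+1)
    n_both   -- number counted by both systems (n11)
    """
    if n_both <= 0:
        raise ValueError("need at least one match to form the estimate")
    return n_first * n_second / n_both


# Fish pond figures from the slide.
day1, day2, matched = 950, 900, 855

total = dual_system_estimate(day1, day2, matched)
# Fish seen on at least one day: 950 + 900 - 855 = 995.
missed_both = total - (day1 + day2 - matched)  # implied n00

print(total)        # 1000.0
print(missed_both)  # 5.0
```

The implied count of fish missed on both days (5) fills the n00 cell in the table, which neither day's catch can observe directly.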

Page 18: Imputation in the 2001 Census

Analysis

Separately for each age-sex group, within each stratum, within each EA
Apply DSE method to each sampling point (postcodes) within CCS area
Estimate function DSE = f(observed count)
Apply to all other sampling points within stratum (within EA), and aggregate

Page 19: Imputation in the 2001 Census

Ratio Estimation

Regression-type estimator
Each dot represents a CCS area
Use Census figure to estimate “true” figure

[Figure: scatter plot with axes labelled Census and DSE]
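One simple form of a regression-type estimator is a ratio estimator, which fits a line through the origin relating DSE to Census count in the sampled areas and applies that slope to the Census count elsewhere. The sketch below assumes that form; the slides do not give the exact function f that was fitted, and all figures in the example are hypothetical.

```python
def ratio_estimate(sample_census, sample_dse, census_total):
    """Scale a Census total by the DSE-to-Census ratio seen in sampled areas.

    sample_census -- Census counts in the sampled (CCS) postcodes
    sample_dse    -- DSE population estimates for the same postcodes
    census_total  -- Census count for the whole stratum
    Returns (ratio, estimated population for the stratum).
    """
    if len(sample_census) != len(sample_dse):
        raise ValueError("need one DSE value per sampled Census count")
    census_sum = sum(sample_census)
    if census_sum == 0:
        raise ValueError("sampled Census counts sum to zero")
    ratio = sum(sample_dse) / census_sum
    return ratio, ratio * census_total


# Hypothetical figures for one age-sex group in one stratum.
sample_census = [100, 200, 300]   # Census counts in three sampled postcodes
sample_dse = [105, 210, 315]      # DSE estimates for the same postcodes

ratio, estimate = ratio_estimate(sample_census, sample_dse, 10_000)
print(f"ratio {ratio:.3f}, estimated population {estimate:.0f}")
```

In the process described on the slides this would be repeated for each age-sex group within each stratum and EA, and the results aggregated.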

Page 20: Imputation in the 2001 Census

The One Number Census process

[Flow diagram. Boxes: Census; CCS; Census + CCS; Estimate by age and sex for EA; EA estimates; NI population estimate; Adjusted individual and household data and tables. Step labels: Matching; Dual System and regression estimation; Quality Assurance (QA); Sum; Imputation controlled to EA estimates.]

Page 21: Imputation in the 2001 Census

Imputing households

Use dummy forms as location
Use dummy forms as ‘constraint’?
Dependence on enumerators
Ireland 2006 – 15% of properties vacant

Page 22: Imputation in the 2001 Census

One Number Census outcome

2001 Census response rate of 95%
4.3% in wholly imputed households (mostly linked to dummy forms (3.0%))
0.4% additional people in already enumerated households
Imputed 80,000 people
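The percentages and the headline count can be cross-checked against each other. The sketch below assumes that both percentages are shares of the adjusted 2001 population of 1,685 thousand; the slide does not state the base, so treat the result as a rough consistency check only.

```python
adjusted_population = 1_685_000   # 2001 adjusted Census figure from the slides
share_whole_households = 0.043    # people in wholly imputed households
share_within_households = 0.004   # extra people in already enumerated households

# Assumption: both shares are measured against the adjusted population.
imputed_share = share_whole_households + share_within_households
imputed_people = imputed_share * adjusted_population

print(f"{imputed_share:.1%} of {adjusted_population:,} "
      f"is about {imputed_people:,.0f} people")
```

Under that assumption the total comes to roughly 79 thousand, in line with the rounded figure of 80,000 people on the slide.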

Page 23: Imputation in the 2001 Census

Coverage

Population in thousands

        Published Census figure    MYE
1991    1,578 (enumerated)         1,607
2001    1,685 (adjusted)           1,689

Page 24: Imputation in the 2001 Census
Page 25: Imputation in the 2001 Census

Distribution of LAD Level Underenumeration

[Figure: histogram with axes labelled Response Rate and Frequency]

Page 26: Imputation in the 2001 Census

Response rates by age

[Figure: response rate (axis from 80 to 100) by age group from 0-4 to 80-84, with separate series for male and female]

Page 27: Imputation in the 2001 Census

Quality of returns

So far, considered non-respondents
Person & Household imputation
What about quality of returns actually made?
Decision taken to go for ‘complete’ returns
Item imputation

Page 28: Imputation in the 2001 Census

Edit and Impute - Edit

Limited number of ‘hard’ edits – can’t be married if aged under 16
Larger number of ‘soft’ edits - quality
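A hard edit is a rule that a valid record can never break. As an illustration, the sketch below checks the one rule quoted on the slide. The record layout, field names and rule label are hypothetical and are not taken from the 2001 Census processing system.

```python
def hard_edit_failures(person):
    """Return the names of the hard edit rules that this record breaks."""
    failures = []
    age = person.get("age")
    status = person.get("marital_status")
    # Rule from the slide: a person aged under 16 cannot be married.
    if age is not None and age < 16 and status == "married":
        failures.append("married_under_16")
    return failures


# Hypothetical records for illustration.
records = [
    {"id": 1, "age": 34, "marital_status": "married"},
    {"id": 2, "age": 12, "marital_status": "married"},   # breaks the hard edit
    {"id": 3, "age": 15, "marital_status": "single"},
]

failed_ids = [r["id"] for r in records if hard_edit_failures(r)]
print(failed_ids)  # [2]
```

A record that fails a hard edit would have the offending item blanked and passed on to imputation; the slides do not say which of the conflicting items was changed.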

Page 29: Imputation in the 2001 Census

Edit and Impute - Impute

General principle of ‘complete’ data set
No ‘Not stated’ entries in outputs
Item imputation used
Donor imputation system
No different in principle to systems used in sample surveys
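Donor imputation fills a missing item by copying the value from a similar record that has it. The sketch below shows the general idea with a simple matching-class approach, assuming that "similar" means agreeing on a chosen set of fields. It is not the system used in 2001, and the field names and data are hypothetical.

```python
import random


def donor_impute(records, item, match_keys, seed=0):
    """Fill missing values of `item` by copying from a similar complete record.

    Records are similar when they agree on every field in `match_keys`.
    Returns new dicts; the input records are left unchanged.
    """
    rng = random.Random(seed)  # seeded so the sketch is reproducible

    # Build a pool of donor values for each combination of matching fields.
    donors = {}
    for rec in records:
        if rec.get(item) is not None:
            key = tuple(rec[k] for k in match_keys)
            donors.setdefault(key, []).append(rec[item])

    result = []
    for rec in records:
        filled = dict(rec)
        if filled.get(item) is None:
            pool = donors.get(tuple(rec[k] for k in match_keys))
            if pool:  # leave the gap if no donor shares the matching fields
                filled[item] = rng.choice(pool)
                filled["imputed"] = True
        result.append(filled)
    return result


# Hypothetical records; None marks an item left blank on the return.
people = [
    {"age_group": "20-24", "sex": "M", "occupation": "clerk"},
    {"age_group": "20-24", "sex": "M", "occupation": None},
    {"age_group": "60-64", "sex": "F", "occupation": "teacher"},
    {"age_group": "60-64", "sex": "F", "occupation": None},
]

completed = donor_impute(people, "occupation", ["age_group", "sex"])
print([p["occupation"] for p in completed])
```

Flagging imputed values, as the `imputed` field does here, lets later analysis separate reported from imputed items.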

Page 30: Imputation in the 2001 Census

Edit and Impute - Impute

Level of item imputation differed by variable
Not applied to religion

Page 31: Imputation in the 2001 Census

Summary

Objective in 2001 that Census outputs should reflect whole population
Person and household imputation
5% of persons imputed
Complete records generated for all returns through ‘item’ imputation

Page 32: Imputation in the 2001 Census

I told them in 1951 it was just you, me and the dog, but they keep coming back every 10 years to check.

Page 33: Imputation in the 2001 Census

Looking forward

Date for your diaries … 27 March 2011

Page 34: Imputation in the 2001 Census

Any questions?

Page 35: Imputation in the 2001 Census

Usual residence definition

Historical – present on night
Most countries now ‘usually resident’
Definitions do exist (UN)
2001 – self-assessed
2011 – instructions
‘Intention to stay’