1 Population and Housing Census Editing Department of Economic and Social Development United Nations Statistics Division Studies in Methods, Series F, No.82

Upload: claire-brammer

Post on 02-Apr-2015



Population and Housing Census Editing

Department of Economic and Social Development

United Nations Statistics Division

Studies in Methods, Series F, No.82


U.S. Census Bureau, International Programs Center

www.census.gov/ipc/www

Microcomputer Processing of Census and Surveys (using the Census and Survey Processing System – CSPro)


Processing of Census, Survey or Other Form

[Diagram labels: Forms, Computer, Products, Data User]


Two ways of thinking

[Diagram labels: Information, Questionnaire, Computer Products, Data File, Reports, Tables, Thematic Maps, Graphs]


Data Processing Stages:

1. Get ready for enumeration

2. Monitor and evaluate enumeration

3. Capture the data

4. Validate the data [edit]

5. Produce products


Is there a Magic Button to help us?


Data Capture

[Diagram labels: Forms, Data Capture, Data File]


Data Products

Tabulations

Graphs

Maps


The Goal

Produce useful products from census/survey information.

Useful products are those that meet the needs of the user community.

Produce these products in a quick and efficient manner.


Resource Criteria for Census and Survey Processing

Time
Accuracy
Money
Staff
Regularity
Products


Data Processing: Easy as 1-2-3?

1. Capture the information

2. Validate the information

3. Produce the data products


What software?

There is a lot of data processing software available!

Which best fits your needs? Do you need training? Do you need money? Do you need help?


Why use CSPro?

Designed for census & survey processing
Easy to use
Modular in design
Can be used by novices and/or experts
Free
Excellent support
Windows environment


Census and Survey Processing Software (CSPro)

Tabulations
File descriptions (dictionary)
Data entry applications
Edit applications
Dissemination Products


CSPro (Census and Survey Processing) is a public-domain software package for entering, editing, tabulating, and mapping census and survey data.

1. Create products [tables, maps, etc.]

2. Disseminate the results

CSPro was designed and implemented through a joint effort among the developers of IMPS and ISSA: the United States Census Bureau, Macro International, and Serpro, S.A. Funding for the development is provided by the Office of Population of the United States Agency for International Development.

CSPro is designed to eventually replace both IMPS and ISSA.


Data Dictionary

The data dictionary is the base for most of the parts of CSPro.

These parts include:

Data entry (CSEntry)
Data editing (CSBatch)
Tabulation


Data File Design

How are data stored in the data file?

What is a case? What is on a record? How many records?


Objectives:

1. Understand elements of a data file

2. Describe a field, record, and questionnaire

3. Describe data file structures

4. Learn how the CSPro data dictionary defines these elements


Needed information about data file

Need identification fields
Need information / data fields
Need “SIZE” [how many characters]
Need valid values/codes


Data File Structure: ASCII/text

ALL data on ONE record/line

Different types of data on DIFFERENT record/lines


Data Processing/Data File Terminology

Item/variable/field
Record
Questionnaire (Case)
Data file


Item/variable/field

Is a single piece of information.

Has the attributes of:

Size
Type (Numeric/Alphanumeric)

Examples:

Age: 51
Sex: M
Income: -987.65
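The size and type attributes of a field can be made concrete with a short sketch. This is illustrative Python, not CSPro syntax: the `Field` class and the `fits` function are hypothetical names, and the field sizes (2, 1, and 7 characters) are assumptions read off the three sample values on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One item in a record: a name, a size in characters, and a type."""
    name: str
    size: int       # maximum number of characters the field occupies
    numeric: bool   # True for numeric, False for alphanumeric


def fits(field: Field, raw: str) -> bool:
    """Return True if the raw text respects the field's size and type."""
    if len(raw) > field.size:
        return False
    if field.numeric:
        try:
            float(raw)
        except ValueError:
            return False
    return True


# Hypothetical definitions matching the slide's examples.
AGE = Field("Age", size=2, numeric=True)
SEX = Field("Sex", size=1, numeric=False)
INCOME = Field("Income", size=7, numeric=True)
```

Under these assumptions `fits(AGE, "51")`, `fits(SEX, "M")`, and `fits(INCOME, "-987.65")` all hold, while a three-character age such as `"151"` is rejected because it is longer than the field.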


Record

A collection of related items forming a single line of information.

For example:

Housing Record: contains information about the house

Population Record: contains information about each person in the house


Case/Questionnaire

all the records of all types for a processing unit such as a household


A data file is

a collection of all the questionnaires (cases)
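The relationship between records, cases, and the data file can be sketched as follows. This is a minimal Python illustration under an assumed layout that is not taken from the slides: the first character of each line is the record type ("1" for housing, "2" for population) and the next four characters are the household identifier. The function name `group_into_cases` and the sample lines are hypothetical.

```python
def group_into_cases(lines, id_start=1, id_len=4):
    """Group the lines of a text data file into cases keyed by household ID."""
    cases = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        record_type = line[0]
        case_id = line[id_start:id_start + id_len]
        case = cases.setdefault(case_id, {"housing": [], "population": []})
        key = "housing" if record_type == "1" else "population"
        case[key].append(line)
    return cases


# Hypothetical data file: two households, one housing record each.
SAMPLE_FILE = [
    "10001321",      # housing record, household 0001
    "20001011034",   # population record, household 0001, person 01
    "20001022023",   # population record, household 0001, person 02
    "10002212",      # housing record, household 0002
    "20002011051",   # population record, household 0002, person 01
]

CASES = group_into_cases(SAMPLE_FILE)
```

Here the data file yields two cases; household 0001 has one housing record and two population records.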


CSPro Data Dictionary

1. Field names/labels

2. Field size

3. Field location

4. Field attributes

5. Record names

6. Record types

7. Record IDs

8. Records allowed by type
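To show how field names, sizes, and locations work together, here is a small Python sketch of a dictionary-driven reader. It is not the CSPro dictionary format; the layout `POPULATION_RECORD`, its field names, and the function `read_record` are hypothetical and chosen only for illustration.

```python
# Hypothetical layout for a population record: (field name, start column, size).
# Start columns are 1-based, as positions in a text record usually are.
POPULATION_RECORD = [
    ("RECORD_TYPE", 1, 1),
    ("HOUSEHOLD_ID", 2, 4),
    ("PERSON_NUMBER", 6, 2),
    ("SEX", 8, 1),
    ("AGE", 9, 3),
]


def read_record(line, layout):
    """Slice one text record into named fields using the layout."""
    return {
        name: line[start - 1:start - 1 + size]
        for name, start, size in layout
    }


PERSON = read_record("20001011034", POPULATION_RECORD)
```

With this layout the line `20001011034` is read as record type 2, household 0001, person 01, sex 1, age 034. The point of the sketch is that the same layout can drive entry, editing, and tabulation, which is why the dictionary sits at the base of the other parts.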


Questionnaire sections


One Section ==> One record


From the questionnaire to the data file

(one record type)


From the questionnaire to the data file

(What are the data?)


From the questionnaire to the data file

(Where are the data?)


Data Dictionary describes the data file


CSPro Support

Web site: http://www.census.gov/ipc/www/

E-mail: [email protected]

[End of CSPro demonstration]


UN Editing Handbook

Uses Principles and Recommendations as base
Covers how editing fits into whole process
Describes different types of edits
Gives examples


Purpose of Handbook

No census data are ever perfect
Changes are made -- little documentation
Promote communication between subject specialists and programmers
“Cookbook” of suggestions -- presents possible resolutions
But country edit teams must decide


Major Elements in a Census

Preparatory work
Enumeration
Data processing -- keying, editing and tabulations
Building data bases and dissemination
Evaluation of results
Analysis of results


Errors in Census Process

Coverage errors
Questionnaire design
Enumerator/respondent errors
Coding errors
Data entry errors
Computer editing errors
Tabulation errors


Errors Generated During Census Processing

Activity                  Type of Error

┌───────────────────┐
│    Enumeration    │     Respondent errors
└─────────┬─────────┘     Enumerator errors
          V
┌─────────┴─────────┐
│   Field Editing   │     Field checking
└─────────┬─────────┘     Office checking
          V
┌─────────┴─────────┐
│   Office Coding   │     Miscodes
└─────────┬─────────┘
          V
┌─────────┴─────────┐
│   Data Capture    │     Miskeys
└─────────┬─────────┘
          V
┌─────────┴─────────┐     Logic errors
│ Computer Editing  │     Misallocation
└─────────┬─────────┘     Miscorrection
          V
┌─────────┴─────────┐
│    Tabulation     │     Distribution of
└─────────┬─────────┘     unknowns
          V
┌─────────┴─────────┐
│    Publication    │     Misprints
└───────────────────┘


Editing in Historical Perspective

Before computers: manual editing
With computers:
Increased complexity
Automated changes
Generalized editing packages
New philosophies of editing
Personal computers
Appropriate levels of computer editing


Editing Team

Appropriate internal subject matter specialists
Computer programmers
Work together as a team
Edit specs as means of communication
Outside experts -- academicians
Outside experts -- private sector


WHAT CENSUS EDITING SHOULD DO

1 Give users measures of the quality of the data

2 Identify the types and sources of error, and

3 Provide adjusted census results


TABLE 1. SAMPLE POPULATION BY 15-YEAR AGE GROUP AND SEX, USING UNEDITED AND EDITED DATA

                      Unedited data                          Edited data
Age group             Total   Male   Female   Not reported   Total   Male   Female
Total                  4147   2033     2091             23    4147   2045     2102
Less than 15 years     1639    799      825             15    1743    855      888
15 to 29 years         1256    612      643              1    1217    603      614
30 to 44 years          727    356      369              2     695    338      357
45 to 59 years          360    194      166              0     341    182      159
60 to 74 years          116     54       59              3     114     53       61
75 years and over        34     12       22              0      37     14       23
Not reported             15      6        7              2

Sample table with & without unknowns


TABLE 2. POPULATION AND POPULATION CHANGE BY 15-YEAR AGE GROUP WITH UNKNOWNS: 1990 AND 2000

                      Numbers          Number   Per cent     Per cent
Age group             2000    1990     change     change    2000    1990
Total                 4147    3319        828       24.9   100.0   100.0
Less than 15 years    1639    1348        291       21.6    39.5    40.6
15 to 29 years        1256     902        354       39.2    30.3    27.2
30 to 44 years         727     538        189       35.1    17.5    16.2
45 to 59 years         360     200        160       80.0     8.7     6.0
60 to 74 years         116      89         27       30.3     2.8     2.7
75 years and over       34      25          9       36.0     0.8     0.8
Not reported            15     217       -202      -93.1     0.4     6.5

Table showing trends with unknowns
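The change columns in Table 2 are simple arithmetic, and recomputing them shows how much the "Not reported" row moves between censuses. The sketch below is illustrative Python; the counts are copied from Table 2, while the names `COUNTS` and `change` are hypothetical.

```python
# (2000 count, 1990 count) for each row of Table 2.
COUNTS = {
    "Total": (4147, 3319),
    "Less than 15 years": (1639, 1348),
    "15 to 29 years": (1256, 902),
    "30 to 44 years": (727, 538),
    "45 to 59 years": (360, 200),
    "60 to 74 years": (116, 89),
    "75 years and over": (34, 25),
    "Not reported": (15, 217),
}


def change(new, old):
    """Return (number change, per cent change rounded to one decimal)."""
    diff = new - old
    return diff, round(100.0 * diff / old, 1)


CHANGES = {group: change(new, old) for group, (new, old) in COUNTS.items()}
```

Because 217 people had no reported age in 1990 and only 15 in 2000, some of the apparent growth in each age group may simply reflect unknowns being reported or distributed differently, which is one reading of why the slide shows the trend "with unknowns".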


Basics of Census Editing

Systematic inspection and change (not always correction)

Fatal edits -- invalid or missing entries
Query edits -- inconsistencies
Must preserve the original data as much as possible
Quality enumeration more important than editing
Edit does not improve data quality -- makes the data more esthetic
Team must determine how far to go


More of Basics

Over-editing is harmful
Treatment of unknowns
Spurious changes
Determining tolerances
Learning from the edit process
Quality assurance
Costs of editing
Imputation
Archiving


How Over-editing is Harmful

Timeliness
Finances
Distortion of true values
A false sense of security


Editing Applications

Manual versus automatic correction
Guidelines for correcting data
Validity and consistency checks
Methods of correcting and imputing data
Other editing systems


Manual versus Automatic Correction

Manual correction: takes a long time and is very subject to error
Automatic correction: faster and consistent. Not necessarily correct, just consistent.
Can look at many variables at the same time
Can keep an audit trail


Guidelines for Correcting Data

Make the fewest required changes possible to the originally collected data

Eliminate obvious inconsistencies among the entries

Systematically supply entries for erroneous or missing items by using other entries for the housing unit, person, or other persons in the household or group

When appropriate, use “not reported”


Dangers in editing

Male with fertility – so fertility deleted
Second male in spouse pair made female
Then, female without fertility – so fertility imputed
So, before one error – now the initial error remains, but we have three MORE errors


Example of B-A-D Edit Changes

Person  Relationship        Sex     Children ever born

Unedited data
1       Head of household   Male    03
2       Spouse              Male    BLANK

Data after editing for sex
1       Head of household   Female  03
2       Spouse              Male    BLANK


Sample house for hotdeck example

ID number  Relationship  Sex  Age
 1         1             1    39
 2         2             2    35
 3         3             1    13
 4         3             [ ]  10
 5         4             2    40
 6         4             1    [ ]
 7         4             2    13
 8         5             [ ]  [ ]
 9         5             1    44
10         5             2    36


Initial and Final Hot Deck Values for single family

Initial values

             Relationships
             Head of household  Spouse  Son/daughter  Other relative  Non-relative
             (1)                (2)     (3)           (4)             (5)
Male (1)     35                 35      12            40              40
Female (2)   32                 32      12            37              37

Values after changes

             Relationships
             Head of household  Spouse  Son/daughter  Other relative  Non-relative
             (1)                (2)     (3)           (4)             (5)
Male (1)     39                 35      13            40              44
Female (2)   32                 35      12            13              36
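The movement from the initial to the final hot deck values can be traced with a short sketch. This is illustrative Python, not the presenter's program. It assumes the sample household reads, in order, as relationships 1, 2, 3, 3, 4, 4, 4, 5, 5, 5 with the sexes and ages shown on the sample-house slide, that a person with valid sex and age updates the age deck, and that a person with valid sex but blank age takes the stored value. Persons with blank sex are left alone here, since the slides do not show that step. The names `initial_deck`, `HOUSEHOLD`, and `run_hot_deck` are hypothetical.

```python
def initial_deck():
    """Age hot deck indexed by sex (1 male, 2 female), then relationship (1-5)."""
    return {
        1: {1: 35, 2: 35, 3: 12, 4: 40, 5: 40},
        2: {1: 32, 2: 32, 3: 12, 4: 37, 5: 37},
    }


# (ID number, relationship, sex, age); None marks a blank entry.
HOUSEHOLD = [
    (1, 1, 1, 39), (2, 2, 2, 35), (3, 3, 1, 13), (4, 3, None, 10),
    (5, 4, 2, 40), (6, 4, 1, None), (7, 4, 2, 13), (8, 5, None, None),
    (9, 5, 1, 44), (10, 5, 2, 36),
]


def run_hot_deck(household, deck):
    """Update the deck from valid persons and impute blank ages from it."""
    edited = []
    for pid, rel, sex, age in household:
        if sex is not None:
            if age is not None:
                deck[sex][rel] = age   # valid person becomes the new donor
            else:
                age = deck[sex][rel]   # blank age takes the stored value
        edited.append((pid, rel, sex, age))
    return edited, deck


EDITED, FINAL_DECK = run_hot_deck(HOUSEHOLD, initial_deck())
```

Under these assumptions the final deck matches the "Values after changes" table, and person 6 (a male other relative with blank age) receives age 40, the value still stored for that cell when the record is reached.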


Validity and Consistency Checks

1. Top-down editing approach

2. Multiple variable edit

3. Coding considerations


1. Top Down Approach: Order of Edits

HOUSING VARIABLES ON QUESTIONNAIRE

Type of Dwelling
Rooms
Walls
Roof
Tenure

HOUSING VARIABLES – ORDER OF EDITS

Tenure
Type of Dwelling
Rooms
Walls
Roof


2. Multiple Variable Approach – Young Widowed Head with 3 Children

Number  Rule                                                                     Relation  Sex  Age  MarStat  Fertility
1       Head of household should be 15 years or older                            1              1
2       Spouse should be 15 years or older
3       A “spouse” should be married
4       If spouse present, head of household should be married
5       If spouse present, head of household and spouse should be opposite sex
6       Person less than 15 years old should be never married                                   1    1
7       Male should have no fertility
8       Female less than 15 years old should have no fertility                             1    1
9       For female 15 years or older fertility entry should not be blank
10      A “child” should be younger than head of household
11      A “parent” should be older than head of household
Totals                                                                           1         1    3    1        0
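The tally in the table above can be reproduced with a small sketch that counts, for each variable, how many failed rules involve it. This is illustrative Python under several assumptions: only rules 1, 6, 7, and 8 are coded (the ones that can be checked on a single person record), the variables charged to rules 1, 6, and 8 are inferred from the slide's totals, the variables charged to rule 7 are a guess because the slide shows no marks for it, and the sample head is taken to be a 12-year-old widowed female with three children. All names are hypothetical.

```python
from collections import Counter

VARIABLES = ["Relation", "Sex", "Age", "MarStat", "Fertility"]

# (rule number, test that is True when the rule FAILS, variables charged)
RULES = [
    (1, lambda p: p["relation"] == "head" and p["age"] < 15,
     ("Relation", "Age")),
    (6, lambda p: p["age"] < 15 and p["marstat"] != "never married",
     ("Age", "MarStat")),
    (7, lambda p: p["sex"] == "male" and p["children"] is not None,
     ("Sex", "Fertility")),   # assumption: slide shows no marks for rule 7
    (8, lambda p: p["sex"] == "female" and p["age"] < 15
        and p["children"] is not None,
     ("Sex", "Age")),         # as inferred from the slide's totals
]


def tally_failures(person):
    """Count how many failed rules involve each variable."""
    counts = Counter({name: 0 for name in VARIABLES})
    for _number, fails, charged in RULES:
        if fails(person):
            counts.update(charged)
    return counts


# Assumed sample record for the "young widowed head with 3 children".
YOUNG_HEAD = {"relation": "head", "sex": "female", "age": 12,
              "marstat": "widowed", "children": 3}

TALLY = tally_failures(YOUNG_HEAD)
MOST_IMPLICATED = TALLY.most_common(1)[0][0]
```

Age is involved in all three failed rules, so under this approach it is the first candidate for change: correcting one variable may resolve several inconsistencies at once.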


3. Common Codes Assist in Editing

Group Birthplace Citizenship Language Ethnicity

France/French 10 10 10 10

Spain/Spanish 20 20 20 20

Latin America 25 25 20 25

Philippines/Filipino 30 30 30

Ilokano 32

Tagalog 32

England/English 40 40 40 40

Canada 50 50 40 50

USA 52 52 40 52


Methods of Correcting and Imputing Data

1. Change to unknown

2. Static or “Cold Deck” imputation

3. Dynamic or “Hot deck” imputation


1. Changing to Unknown – When you don’t have enough information

Usually in censuses, we don’t have enough information to get a good estimate of paid occupations and industries:

if not OCCUPATION in 001:997 then
    errmsg("Occupation is invalid, assign unknown");
    OCCUPATION = 998;
endif;


Changing to unknown: Countries choosing not to impute

These days, most countries impute at least items needed for planning and policy determination.
If a country still decides not to impute, then staff might assign “unknown” even to items used for planning:

if not SEX in 1:2 then
    SEX = 9;
endif;


2. Static Imputation – Making young people “Never married”

In Static Imputation, the same value or values are always assigned:

if AGE < 15 then
    if MARITAL_STATUS <> NEVER_MARRIED then
        errmsg("Young person not never married");
        MARITAL_STATUS = NEVER_MARRIED;
    endif;
endif;


A kind of static imputation: changing using logical values

Since we have only two sexes, we can alternate between them when invalid or inconsistent values appear:

Keep a cell in the computer’s buffer: XSEX
If SEX is unknown then
    let SEX = XSEX
    change XSEX to the other SEX for the next usage
endif


3. Hot Deck Imputation

Geographic considerations
Use of related items
Sequence of the items
Complexity of the matrices
Standardized hot decks
Size of hot decks -- too big, audit trail, too small, difficult items


Types of Edits

Structure edits – Bookkeeping, getting each locality within each minor civil division within each major civil division
Content edits – Housing items
Content edits – Population items
Content edits – Inter-record checking


Standard Edit: Language Edit

If this is the head and language is missing, first look for someone else in the house with language, and assign that.

If this is the head without language, no one else has language, use neighboring head of similar characteristics to assign a best guess.

If this is someone else in the house and language is missing, assign the head’s language.
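The three rules above can be sketched in a few lines. This is an illustrative Python version, not the CSPro edit: it assumes valid language codes run from 1 to 17, that the head is the first person in the household list, and that the hot deck is keyed by the head's age group and sex. The names `edit_household_language`, `is_valid`, and the dictionary keys are hypothetical.

```python
VALID_LANGUAGES = range(1, 18)   # codes 1 to 17


def is_valid(code):
    return code in VALID_LANGUAGES


def edit_household_language(persons, hot_deck):
    """Apply the language rules to one household; the head is persons[0]."""
    head = persons[0]
    key = (head["age_group"], head["sex"])
    if is_valid(head["language"]):
        hot_deck[key] = head["language"]          # valid head updates the deck
    else:
        donors = [p["language"] for p in persons[1:] if is_valid(p["language"])]
        if donors:
            head["language"] = donors[0]          # someone else in the house
        else:
            head["language"] = hot_deck[key]      # neighboring head, same cell
    for person in persons[1:]:
        if not is_valid(person["language"]):
            person["language"] = head["language"]  # others take the head's
    return persons


def person(age_group, sex, language):
    return {"age_group": age_group, "sex": sex, "language": language}


DECK = {(3, 1): 6}
HOUSE_A = edit_household_language(
    [person(3, 1, None), person(2, 2, 6), person(1, 2, 6)], DECK)
HOUSE_B = edit_household_language(
    [person(3, 1, None), person(3, 2, None), person(1, 1, None)], DECK)
HOUSE_C = edit_household_language(
    [person(3, 1, 9), person(2, 2, None)], DECK)
```

In this sketch the head of the first house takes language 6 from another member, the second house has no valid entry at all so the head draws 6 from the deck and the others copy it, and the third house's valid head replaces the stored value with 9 for later use.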


PROC LANGUAGE
    errmsg(" ******* Language ************ ") summary;

    { ********************************************************
      ******************   Language edit   *******************
      ******************************************************** }

    if LANGUAGE in 1:17 then
        { valid entry: if this is the head, update the hot deck }
        if RELAT = 1 then
            ALANGUAGE(AGE10,SEX) = LANGUAGE;
        endif;
    else
        if RELAT = 1 then
            { head: look for someone else in the house with a valid language }
            PERSONPTR = 0;
            do varying i = 1 until i > TOTOCC(POP_EDT)
                if LANGUAGE(i) in 1:17 then
                    PERSONPTR = i;
                endif;
            enddo;
            if PERSONPTR = 0 then
                { no one in the house has a valid language: use the hot deck }
                errmsg("*D05-2* LANGUAGE imputed from Age and Sex, pn= %02d, lang= %01d", PERSNUM,LANGUAGE) denom = denomPop summary;
                F1F2();
                write("*D05-2* LANGUAGE imputed from Age and Sex, pn= %02d, lang= %01d", PERSNUM,LANGUAGE);
                impute(LANGUAGE, ALANGUAGE(AGE10,SEX));
            else
                { someone else has a valid language: give it to the head }
                errmsg("*D05-3* Head's LANGUAGE imputed from Other's LANGUAGE, pn= %02d, lang= %01d, personptr= %02d, pr-lang %01d", PERSNUM,LANGUAGE,PERSONPTR,LANGUAGE(PERSONPTR)) denom = denomPop summary;
                F1F2();
                write("*D05-3* Head's LANGUAGE imputed from Other's LANGUAGE, pn= %02d, lang= %01d, personptr= %02d, pr-lang %01d", PERSNUM,LANGUAGE,PERSONPTR,LANGUAGE(PERSONPTR));
                impute(LANGUAGE, LANGUAGE(PERSONPTR));
            endif;
        else
            { not the head: take the head's language }
            F1F2();
            errmsg("*D05-4* LANGUAGE imputed from Head's LANGUAGE, lang %d langhd %d", LANGUAGE,LANGUAGE(headpt)) denom = denomPop summary;
            write("*D05-4* LANGUAGE imputed from Head's LANGUAGE, pn= %02d, lang= %01d, langhd= %01d", persnum,LANGUAGE,LANGUAGE(headpt));
            impute(LANGUAGE, LANGUAGE(1));
        endif;
    endif;


Language OK and head, update the hotdeck

For the Standard edit, if the variable is valid, we update the hot deck.

This is the code:

if LANGUAGE in 1:17 then
    if RELAT = 1 then
        ALANGUAGE (AGE10,SEX) = LANGUAGE;
    endif;


Single person house, get language from nearby house

Normally, we want to look for others in the house with the variable.

But in one-person houses there is no one else to look at, so we have to impute:

if RELAT = 1 then
    if TOTOCC (POP_EDT) = 1 then
        errmsg("*D05-2A* Single person house: Language imputed from Age and Sex, pn= %02d, lang= %01d", PERSNUM,LANGUAGE) denom = denomPop summary;
        F1F2();
        write("*D05-2A* Single person house: Language imputed from Age and Sex, pn= %02d, lang= %01d", PERSNUM,LANGUAGE);
        impute( LANGUAGE , ALANGUAGE (AGE10,SEX));


Someone else in house has language, assign that to head

Search the house for another person with a valid language and assign it to the head. The loop keeps the last valid person found; if no one is found, PERSONPTR stays 0 and the head's language comes from the hot deck:

    else
        PERSONPTR = 0;
        do varying i = 1 until i > TOTOCC (POP_EDT)
            if LANGUAGE (i) in 1:17 then
                PERSONPTR = i;
            endif;
        enddo;
        if PERSONPTR = 0 then
            errmsg("*D05-2* LANGUAGE imputed from Age and Sex, pn= %02d, lang= %01d", PERSNUM,LANGUAGE) denom = denomPop summary;
            F1F2();
            write("*D05-2* LANGUAGE imputed from Age and Sex, pn= %02d, lang= %01d", PERSNUM,LANGUAGE);
            impute( LANGUAGE , ALANGUAGE (AGE10,SEX));


If no one else has language, get from nearby head same age and sex

When no one else has a valid entry for this item (PERSONPTR = 0), the head's language is imputed from the hot deck, that is, from the nearest previous head of the same age group and sex. Otherwise the head takes the language of the other person found:

errmsg("*D05-3* Head's LANGUAGE imputed from Other's LANGUAGE, pn= %02d, lang= %01d, personptr= %02d, pr-lang %01d", PERSNUM,LANGUAGE,PERSONPTR,LANGUAGE(PERSONPTR)) denom = denomPop summary;
F1F2();
write("*D05-3* Head's LANGUAGE imputed from Other's LANGUAGE, pn= %02d, lang= %01d, personptr= %02d, pr-lang %01d", PERSNUM,LANGUAGE,PERSONPTR,LANGUAGE(PERSONPTR));
impute (LANGUAGE , LANGUAGE (PERSONPTR));


For others in house, assign head’s language

Once the head has a valid entry for the variable, others can obtain theirs from the head:

F1F2();
errmsg("*D05-4* LANGUAGE imputed from Head's LANGUAGE, lang %d langhd %d", LANGUAGE,LANGUAGE(headpt)) denom = denomPop summary;
write("*D05-4* LANGUAGE imputed from Head's LANGUAGE, pn= %02d, lang= %01d, langhd= %01d", persnum,LANGUAGE,LANGUAGE(headpt));
impute (LANGUAGE , LANGUAGE (1));


Language Edit: Within House

Example of WRITE Statement in CSPro to assist in finding the error

Note: Before and after edit displays, with what is done in edit in the middle

Assigning Head’s language from other people

91200217 Population Group Case = 0009
ID1 ID2 PN SEX AGE RTN GRP LAN REL RSA PRV CNT CTZ URS PERMPLAC SM
01 1 034 01 1    55 1 09 1 1
02 2 023 02 1 06 55 1 07 1 1
03 2 005 03 1 06 55 1 09 1 1
04 2 003 03 1 06 55 1 09 1 1
V.14c: P07 invalid for head, imputing from other PN = 01 Lang =    Oth lang = 06
V.14c: P07 invalid for head, imputing from other PN = 01 Lang = 06 Oth lang = 06
V.14c: P07 invalid for head, imputing from other PN = 01 Lang = 06 Oth lang = 06
end
ID1 ID2 PN SEX AGE RTN GRP LAN REL RSA PRV CNT CTZ URS PERMPLAC SM
01 1 034 01 1 06 55 1 09 1 1
02 2 023 02 1 06 55 1 07 1 1
03 2 005 03 1 06 55 1 09 1 1
04 2 003 03 1 06 55 1 09 1 1


Language Edit: Imputed Head from Previous Household Head

No one has language, so first the head gets language from the previous head of the same age and sex.
Then the others in the house get their language from the head.

91200697 Language Case = 0027
ID1 ID2 PN SEX AGE RTN GRP LAN REL RSA PRV CNT CTZ URS PERMPLAC SM
01 1 027 01 1       1 09 1 1
02 2 027 02 1       1 09 1 1
03 1 005 03 1       1 09 1 1
V.14d: P07 invalid, imputing from deck ALANGUAGE PN = 01 Lang =
V.15d: P08 invalid for head, impute from deck ARELIGIO PN = 01 Head Relig =
V.14f: P07 invalid, imputing from head PN = 02 Lang =    Head's lang = 06
V.15f: P08 invalid, imputing from head's religion PN = 02 Relig =    Head's relig = 38
V.14f: P07 invalid, imputing from head PN = 03 Lang =    Head's lang = 06
V.15b: imputing P08 from mother's religion PN = 03 Relig =    Mo relig = 38
end
ID1 ID2 PN SEX AGE RTN GRP LAN REL RSA PRV CNT CTZ URS PERMPLAC SM
01 1 027 01 1 06 38 1 09 1 1
02 2 027 02 1 06 38 1 09 1 1
03 1 005 03 1 06 38 1 09 1 1


Series of Edit Problems

We are going to do three exercises that will look at different kinds of editing problems

Note: these are all simplified – most of the time edits must be more complicated

But these cover the basics


Countries choosing not to impute

Exercise 1 in the packet: simple edits for population items

These days, most countries impute at least the items needed for planning and policy determination

If a country still decides not to impute, then staff might assign “unknown” even to items used for planning:

if not SEX in 1:2 then
  SEX = 9;
endif;


A simple kind of edit: when SEX is invalid

Since SEX has only two valid codes (1 and 2), the easiest way to edit is to alternate between them:

SEXCHANGE = 2;
. . .
if not SEX in 1:2 then
  SEX = SEXCHANGE;
  SEXCHANGE = 3 - SEXCHANGE;
endif;

The program assigns the stored sex and then flips the holding variable to await the next instance of “bad” sex
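The alternation can be checked with a small Python sketch. This is an illustration only, not the CSPro program; the class name `SexAlternator` is hypothetical, and the codes 1 and 2 and the starting value 2 are taken from the slide.

```python
class SexAlternator:
    """Holds the next sex code to hand out, like SEXCHANGE on the slide."""

    def __init__(self, start=2):
        self.next_sex = start

    def edit(self, sex):
        """Return a valid sex code, imputing alternately when sex is invalid."""
        if sex in (1, 2):
            return sex  # valid values pass through untouched
        imputed = self.next_sex
        self.next_sex = 3 - self.next_sex  # 3 - 2 = 1 and 3 - 1 = 2
        return imputed


alternator = SexAlternator()
edited = [alternator.edit(s) for s in [1, 0, 9, 2, None]]
print(edited)  # [1, 2, 1, 2, 2]
```

The holding variable changes only when an imputation happens, so the imputed values stay close to an even split regardless of how many valid records fall in between.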


What if this person has complete fertility information?

Use other intra-record variables to assist in an edit.

If this person has an invalid entry for sex but has fertility information:

if not SEX in 1:2 then
  if FERTILITY <> NOTAPPL then
    errmsg("Has Fertility info so Female");
    SEX = 2;
  endif;
endif;


But what if this is the Spouse and we know the Head’s Sex?

For Programmers: Use inter-record information when it is available:

So when Head has sex reported, but the Spouse does not:

{SEX(1) is the head's sex, assuming the head is the first person record}
if SEX(1) in 1:2 then
  errmsg("Sex of spouse from sex of head");
  SEX = 3 - SEX(1);
endif;
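The three sex edits shown so far can be chained from most to least informative. The sketch below is a Python illustration under stated assumptions, not the workshop's CSPro code: the relationship code `SPOUSE = 2`, the field names, the use of `None` for "not applicable" fertility, and the `state` dictionary are all hypothetical.

```python
SPOUSE = 2  # hypothetical relationship code for the spouse of the head


def edit_sex(person, head_sex, state):
    """Return an edited sex code (1 = male, 2 = female) for one person.

    person: dict with 'sex', 'rel' and 'children_born' (None when fertility
    is not applicable). state: dict whose 'next_sex' entry drives the
    alternating fallback.
    """
    if person["sex"] in (1, 2):
        return person["sex"]  # reported value is valid
    if person["children_born"] is not None:
        return 2  # intra-record evidence: fertility information means female
    if person["rel"] == SPOUSE and head_sex in (1, 2):
        return 3 - head_sex  # inter-record evidence: opposite of the head
    imputed = state["next_sex"]  # no evidence at all: alternate
    state["next_sex"] = 3 - imputed
    return imputed
```

The spouse rule assumes opposite-sex couples, which a later slide notes may no longer hold in some countries, so that step would need to be dropped or adapted there.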


Exercise 2: Housing edits

Since housing does not usually require crosstabulations, except by geography, edits tend to be simpler

But you must still edit for invalid values and certain inconsistencies


Housing edits: Rooms and bedrooms

When a census collects both rooms and bedrooms, the number of bedrooms should not be more than the number of rooms

Some countries collect the information independently – rooms except bedrooms, and then bedrooms, so this edit would not work

Edit: If Bedrooms > rooms, then make them the same
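A minimal Python sketch of this edit follows. The slide says only to "make them the same", so lowering bedrooms to the room count (rather than raising rooms) is an assumption of this example, as are the function and parameter names.

```python
def edit_bedrooms(rooms, bedrooms):
    """Return (rooms, bedrooms) with bedrooms capped at the number of rooms.

    Only applies when 'rooms' already includes bedrooms. If the census asks
    for rooms excluding bedrooms, this edit should not be run.
    """
    if bedrooms > rooms:
        bedrooms = rooms  # assumed direction of the correction
    return rooms, bedrooms


print(edit_bedrooms(3, 5))  # (3, 3)
print(edit_bedrooms(4, 2))  # (4, 2)
```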


Housing edits: Walls and Roof

Each variable needs a separate edit

If you use hotdeck, then invalids need to be assigned from nearest neighbor with similar characteristics

Then, you need to check for inconsistencies

For example, if you have a house with a concrete roof but thatch walls, the roof would collapse the walls, so you need an edit to correct for this
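One way to combine the invalid-value edit and the consistency edit is a hot deck of wall types keyed by roof type. The Python sketch below is an illustration only: the code lists, the single incompatible pair, the field names, and the choice to change the walls rather than the roof are all assumptions, and the roof is assumed to have been edited already.

```python
# Hypothetical code lists for the example.
WALL_CODES = {1: "concrete", 2: "wood", 3: "thatch"}
ROOF_CODES = {1: "concrete", 2: "metal", 3: "thatch"}
INCOMPATIBLE = {(1, 3)}  # (roof, wall): concrete roof on thatch walls


def edit_walls(house, wall_deck):
    """Edit the wall code of one housing record.

    wall_deck maps a roof code to the wall code of the most recent house
    with that roof and an acceptable wall value (the nearest neighbor).
    It must start with a consistent value for every roof code.
    """
    roof, wall = house["roof"], house["wall"]
    if wall not in WALL_CODES or (roof, wall) in INCOMPATIBLE:
        house["wall"] = wall_deck[roof]  # impute from the donor house
    else:
        wall_deck[roof] = wall  # accepted value becomes the new donor
    return house


deck = {1: 1, 2: 2, 3: 3}  # starting (cold) values, one per roof code
edit_walls({"roof": 2, "wall": 1}, deck)         # accepted, deck[2] becomes 1
print(edit_walls({"roof": 2, "wall": 9}, deck))  # {'roof': 2, 'wall': 1}
print(edit_walls({"roof": 1, "wall": 3}, deck))  # {'roof': 1, 'wall': 1}
```

Because the deck is keyed by roof type, an imputed wall value always comes from a house with the same roof, which keeps the pair consistent.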


Inter-record checking – one record type

Sometimes you need to look between records, not just within a record

For example, each household should have one and only one head [This is exercise 3]

So you need to look through the house, counting the heads

Need to make sure you have exactly one head

So at least one head and not more than one head!!
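Counting heads is a loop over the person records of one household. The Python sketch below shows only the check, not the correction; the relationship code `HEAD = 1` and the field name `rel` are assumptions for the example.

```python
HEAD = 1  # hypothetical relationship code for the head of household


def check_heads(household):
    """Return 'ok', 'no head' or 'multiple heads' for a list of person dicts."""
    heads = sum(1 for person in household if person["rel"] == HEAD)
    if heads == 0:
        return "no head"
    if heads > 1:
        return "multiple heads"
    return "ok"


print(check_heads([{"rel": 1}, {"rel": 2}, {"rel": 3}]))  # ok
print(check_heads([{"rel": 2}, {"rel": 3}]))              # no head
print(check_heads([{"rel": 1}, {"rel": 1}]))              # multiple heads
```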


Inter-record checking for spouses

Does every household have to have one and only one spouse?

Consider polygamous households … do multiple spouses even live together?

What about other types of household structures?


Other types of inter-record checking

If a spouse is present, the sex of the head and the sex of the spouse should be opposite [This may no longer hold in some countries]

If a spouse is present, both the head and the spouse should be reported as “married” or in “common-law” arrangement – and these should be the same


Inter-record checking for population edits

Age of head and age of spouse

Figure 4. Example of household with potential inconsistencies in age reporting

[Diagram: Father; Head of household (age 43); Spouse (age 70); Son (age 10); Daughter (age 8)]


WHAT IS WRONG HERE?

Note: Head is 43 years old, Spouse is 70

Note: Children are 10 and 8

SO: the spouse's age is inconsistent with the ages of the head and the children, and needs to be changed


Inter-record checking for age

Need to use a hot deck – you have choices

You could have a hot deck with age of head and age of spouse for previous households

OR, you could have a hot deck with age differences between heads and spouses

In either case, you should have separate categories for males and females – because they act differently
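The second option, a hot deck of age differences kept separately by sex of the head, can be sketched in Python as follows. This is an illustration under stated assumptions, not the workshop's edit: the acceptance threshold `max_gap`, the field names, and the starting deck values are all hypothetical.

```python
def edit_spouse_age(head, spouse, diff_deck, max_gap=25):
    """Impute the spouse's age when it is implausibly far from the head's.

    diff_deck maps the head's sex (1 or 2) to the most recent accepted
    difference (head age minus spouse age). max_gap is a hypothetical
    tolerance; a real edit would set it from the country's own data.
    """
    gap = head["age"] - spouse["age"]
    if abs(gap) <= max_gap:
        diff_deck[head["sex"]] = gap  # accepted couple becomes the donor
    else:
        spouse["age"] = head["age"] - diff_deck[head["sex"]]
    return spouse


deck = {1: 3, 2: -3}  # assumed starting differences by sex of head
edit_spouse_age({"sex": 1, "age": 40}, {"age": 36}, deck)         # accepted, deck[1] = 4
print(edit_spouse_age({"sex": 1, "age": 43}, {"age": 70}, deck))  # {'age': 39}
```

With the household from Figure 4, the gap of 27 years fails the assumed tolerance, so the spouse receives the head's age minus the last accepted difference for male heads.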


Inter-record checking: Between Record Types

Until now we looked at one record type – Population or Housing – but sometimes we need to compare them

Vacant houses should have no people and occupied houses should have people

CSPro code:

if TENURE = 5 then                 {Vacant unit}
  if TOTOCC(POP) <> 0 then         {people in vacant unit}
    [determine tenure – owning or renting – code not shown here]
  endif;
else                               {For owned or rented units}
  if TOTOCC(POP) = 0 then          {no people in an occupied unit}
    impute(TENURE, 5);             {make this a vacant unit}
  endif;
endif;
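The same decision table can be written as a small Python function for checking the logic. This is a sketch, not a translation of the production edit: the slide does not show how tenure is determined for a "vacant" unit that contains people, so the `donor_tenure` argument (for example, the tenure of a previous occupied unit) is an assumption of this example.

```python
VACANT = 5  # tenure code for a vacant unit, as on the slide


def edit_tenure(tenure, person_count, donor_tenure):
    """Return the edited tenure code for one housing unit.

    donor_tenure is a hypothetical stand-in for the step the slide leaves
    out: choosing owned or rented for an occupied unit coded as vacant.
    """
    if tenure == VACANT:
        if person_count != 0:
            return donor_tenure  # people in a "vacant" unit
    elif person_count == 0:
        return VACANT  # no people in an "occupied" unit
    return tenure


print(edit_tenure(5, 0, donor_tenure=1))  # 5, a consistent vacant unit
print(edit_tenure(5, 3, donor_tenure=1))  # 1, occupied, so tenure is imputed
print(edit_tenure(2, 0, donor_tenure=1))  # 5, empty, so made vacant
```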


THANK YOU

UN Editing Specifications Workshop