Overview of Data Quality Issues in MICS

Multiple Indicator Cluster Surveys Data Interpretation, Further Analysis and Dissemination Workshop Overview of Data Quality Issues in MICS

Upload: naiya

Post on 22-Feb-2016


TRANSCRIPT

Page 1: Overview of Data Quality Issues  in MICS

Multiple Indicator Cluster Surveys Data Interpretation, Further Analysis and Dissemination Workshop

Overview of Data Quality Issues in MICS

Page 2: Overview of Data Quality Issues  in MICS


Data quality in MICS

Important to maintain data of the highest possible quality!

Important to examine data quality carefully before/during the interpretation of survey findings

Page 3: Overview of Data Quality Issues  in MICS

Looking at data quality – Why?

Confidence in survey results

Identify limitations in results

Inform dissemination and policy formulation; avoid misleading policy makers and third parties

ALL SURVEYS ARE SUBJECT TO ERRORS

Page 4: Overview of Data Quality Issues  in MICS


Errors in surveys

Two types of errors in surveys

Sampling errors

Non-sampling errors

Page 5: Overview of Data Quality Issues  in MICS

Sampling error

The difference between an estimate and the true value that arises because the survey questions a sample of respondents rather than the whole population.

Page 6: Overview of Data Quality Issues  in MICS

Non-sampling errors

Other types of errors, arising at any stage of the survey process other than the sample design, including:

Management decisions

Data processing

Fieldwork performance, etc.

All survey stages are interconnected and play roles in non-sampling errors

Page 7: Overview of Data Quality Issues  in MICS


Control of error in surveys

Sampling errors can be estimated before data collection, and measured after data collection

Non-sampling errors are more difficult to control and/or identify
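As a rough illustration of how a sampling error can be measured after data collection, the sketch below computes the standard error and an approximate 95% confidence interval for an estimated proportion. The function names, the design effect parameter `deff`, and the example numbers are illustrative assumptions, not MICS outputs; actual MICS sampling errors are produced with procedures that account for the full sample design.

```python
import math


def proportion_standard_error(p, n, deff=1.0):
    """Standard error of an estimated proportion.

    p    -- estimated proportion, between 0 and 1
    n    -- number of sampled cases behind the estimate
    deff -- assumed design effect (1.0 means simple random sampling)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n <= 0 or deff <= 0:
        raise ValueError("n and deff must be positive")
    return math.sqrt(deff * p * (1.0 - p) / n)


def confidence_interval(p, n, deff=1.0, z=1.96):
    """Approximate 95% interval (z = 1.96), clipped to the range [0, 1]."""
    se = proportion_standard_error(p, n, deff)
    return max(0.0, p - z * se), min(1.0, p + z * se)


# Hypothetical numbers: an indicator estimated at 50% from 2,500 cases.
se_srs = proportion_standard_error(0.5, 2500)           # simple random sample
se_cluster = proportion_standard_error(0.5, 2500, 4.0)  # assumed design effect of 4
low, high = confidence_interval(0.5, 2500, 4.0)
```

With these assumed inputs, a design effect of 4 doubles the standard error compared with simple random sampling, which is why the same calculation can also be run before fieldwork to judge whether a planned sample size is adequate.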

Page 8: Overview of Data Quality Issues  in MICS

Minimizing non-sampling errors in MICS

MICS has a series of recommendations for quality assurance, including:

Roles and responsibilities of fieldwork teams

Easy-to-use data processing programs

Training length and content

Editing and supervision guidelines

Survey tools

Failure to comply with principles behind these recommendations leads to problems in data quality

Page 9: Overview of Data Quality Issues  in MICS

MICS data quality survey tools

Survey tools to monitor and improve quality, assess quality, identify non-sampling errors:

Field check tables to quantitatively identify non-sampling errors during data collection and to improve quality

• Possible with simultaneous data entry, when data collection is not too rapid

Data quality tables to be produced at the time of final report

Page 10: Overview of Data Quality Issues  in MICS

Data quality tables

A total of 28 tables

Data quality tables to look at:

Departures from expected (demographic, biological, etc.) patterns

Departures from recommended procedures

Internal consistency

Completeness

Indicators of performance

Page 11: Overview of Data Quality Issues  in MICS


DQ.1 Age Distribution of Household Population

Deficit at ages 0-1?

Heaping at age 5?

Overall quality - heaping

Deficit – males AND females?

More heaping at age 50 for females than males
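The slide reads heaping off a chart and does not name a summary measure. One widely used option, swapped in here purely as an illustration, is Whipple's index: the share of people aged 23 to 62 whose reported age ends in 0 or 5, scaled so that 100 means no preference and 500 means every age ends in 0 or 5. The function name and the input layout (single year of age mapped to a count) are assumptions.

```python
def whipple_index(counts_by_age):
    """Whipple's index from a mapping {single year of age: count}.

    Uses ages 23 to 62 inclusive. A value near 100 suggests no preference
    for ages ending in 0 or 5; 500 means all reported ages end in 0 or 5.
    """
    ages = range(23, 63)
    total = sum(counts_by_age.get(age, 0) for age in ages)
    if total == 0:
        raise ValueError("no population reported at ages 23 to 62")
    heaped = sum(counts_by_age.get(age, 0) for age in ages if age % 5 == 0)
    return 100.0 * 5.0 * heaped / total


# Hypothetical age distributions, for illustration only.
uniform_ages = {age: 100 for age in range(0, 100)}
fully_heaped = {age: (100 if age % 5 == 0 else 0) for age in range(0, 100)}

index_uniform = whipple_index(uniform_ages)
index_heaped = whipple_index(fully_heaped)
```

Computing the index separately for males and females would give a numeric counterpart to the slide's observation about heaping at age 50.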

Page 12: Overview of Data Quality Issues  in MICS

DQ.2 Age Distribution of Eligible and Interviewed Women

Age | Household population of women age 10-54: Number | Interviewed women age 15-49: Number | Interviewed women age 15-49: Percent | Percentage of eligible women interviewed (Completion rate)
10-14 | 450 | . | . | .
15-19 | 522 | 507 | 15.2 | 97.1
20-24 | 552 | 537 | 16.1 | 97.2
25-29 | 496 | 478 | 14.3 | 96.4
30-34 | 507 | 487 | 14.6 | 95.9
35-39 | 459 | 444 | 13.3 | 96.7
40-44 | 421 | 415 | 12.4 | 98.6
45-49 | 484 | 469 | 14.1 | 96.9
50-54 | 441 | . | . | .
Total (15-49) | 3441 | 3336 | 100.0 | 96.9

Low response rates for women at young ages

Surplus at age 50-54?

Page 13: Overview of Data Quality Issues  in MICS

DQ.3 Age Distribution of Eligible and Interviewed Men

Age | Household population of men age 10-54: Number | Interviewed men age 15-49: Number | Interviewed men age 15-49: Percent | Percentage of eligible men interviewed (Completion rate)
10-14 | 347 | na | na | na
15-19 | 344 | 323 | 9.9 | 94.0
20-24 | 422 | 407 | 12.4 | 96.3
25-29 | 586 | 566 | 17.3 | 96.6
30-34 | 599 | 571 | 17.4 | 95.3
35-39 | 469 | 443 | 13.5 | 94.6
40-44 | 447 | 434 | 13.3 | 97.2
45-49 | 545 | 531 | 16.2 | 97.4
50-54 | 463 | na | na | na
Total (15-49) | 3411 | 3275 | 100.0 | 96.0

Ratio of 50-54 to 45-49: 0.85

Low response rates for men at young ages

Surplus at age 50-54?

Might also want to look at the number eligible/number in the household list, by age
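The two checks on the DQ.3 table can be reproduced from its counts, as in the minimal sketch below. The variable names are illustrative. Household counts stand in for the number of eligible men, which is an assumption, so the computed completion rates can differ slightly from the published column (which is based on eligible men and may be weighted).

```python
# Counts copied from the DQ.3 table (men).
household_men = {
    "15-19": 344, "20-24": 422, "25-29": 586, "30-34": 599,
    "35-39": 469, "40-44": 447, "45-49": 545, "50-54": 463,
}
interviewed_men = {
    "15-19": 323, "20-24": 407, "25-29": 566, "30-34": 571,
    "35-39": 443, "40-44": 434, "45-49": 531,
}

# Approximate completion rate per age group, in percent.
completion_rate = {
    age: 100.0 * interviewed_men[age] / household_men[age]
    for age in interviewed_men
}

# Age group with the lowest approximate completion rate.
lowest_age_group = min(completion_rate, key=completion_rate.get)

# Ratio of men age 50-54 to men age 45-49 in the household list.
# A ratio well above what the age structure would suggest can point to
# respondents being pushed out of the eligible age range.
ratio_50_54_to_45_49 = household_men["50-54"] / household_men["45-49"]
```

Here the ratio rounds to the 0.85 shown in the table, and the lowest completion rate falls in the youngest age group, in line with the slide's comment.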

Page 14: Overview of Data Quality Issues  in MICS

DQ.4 Age Distribution of Children

Age | Household population of children 0-7 years: Number | Under-5s with completed interviews: Number | Under-5s with completed interviews: Percent | Percentage of eligible under-5s interviewed (Completion rate)
0 | 4621 | 3966 | 19.0 | 85.8
1 | 4552 | 4102 | 19.6 | 90.1
2 | 4660 | 4177 | 20.0 | 89.6
3 | 4760 | 4329 | 20.7 | 91.0
4 | 4837 | 4347 | 20.8 | 89.9
5 | 4852 | . | . | .
6 | 5637 | . | . | .
7 | 5919 | . | . | .
Total (0-4) | 23430 | 20922 | 100.0 | 89.3

Low response rates for infants?

Out-transference?

Out-transference?

Page 15: Overview of Data Quality Issues  in MICS

DQ.5 Birth Date Reporting, Household Population

Completeness of reporting of month and year of birth

Age | Year and month of birth | Year of birth only | Month of birth only | Both missing | Total
Total | 99.2 | 0.7 | 0.0 | 0.1 | 100.0
0-4 | 100.0 | 0.0 | 0.0 | 0.0 | 100.0
5-14 | 99.9 | 0.1 | 0.0 | 0.0 | 100.0
15-24 | 99.9 | 0.1 | 0.0 | 0.0 | 100.0
25-49 | 99.8 | 0.2 | 0.0 | 0.0 | 100.0
50-64 | 99.0 | 0.9 | 0.0 | 0.1 | 100.0
65-84 | 96.1 | 3.9 | 0.0 | 0.0 | 100.0
85+ | 89.1 | 10.9 | 0.0 | 0.0 | 100.0
DK/missing | 22.2 | 0.0 | 0.0 | 77.8 | 100.0

Is the inclusion of question on date of birth justified?

Page 16: Overview of Data Quality Issues  in MICS

DQ.5 Birth Date Reporting, Household Population

Completeness of reporting of month and year of birth

Age | Year and month of birth | Year of birth only | Month of birth only | Both missing | Total
Total | 51.3 | 15.9 | 1.3 | 31.5 | 100.0
0-4 | 96.9 | 1.8 | 0.2 | 1.1 | 100.0
5-14 | 79.1 | 8.9 | 1.8 | 10.1 | 100.0
15-24 | 55.0 | 15.8 | 2.3 | 26.9 | 100.0
25-49 | 32.8 | 22.4 | 1.0 | 43.8 | 100.0
50-64 | 18.0 | 23.0 | 0.5 | 58.6 | 100.0
65-84 | 12.9 | 20.8 | 0.2 | 66.1 | 100.0
85+ | 7.3 | 17.5 | 0.0 | 75.2 | 100.0
DK/missing | 10.5 | 15.8 | 0.0 | 73.7 | 100.0

Is the inclusion of question on date of birth justified?

Page 17: Overview of Data Quality Issues  in MICS

DQ.6 to DQ.9

Birth Date and Age Reporting for women, men, under-5s, and children, adolescents and young people – same structure

DQ.6: Birth date and age reporting: Women

Percent distribution of women age 15-49 years by completeness of date of birth/age information, Country, Year

Columns: Completeness of reporting of date of birth and age (Year and month of birth; Year of birth and age; Year of birth only; Age only; Other/DK/Missing); Total; Number of women age 15-49 years

Rows (each with Total = 100.0; other cells blank in the template): Total; Region: Region 1, Region 2, Region 3, Region 4, Region 5; Area: Urban, Rural

More important to have full birth dates for individual respondents, adolescents, young people

Page 18: Overview of Data Quality Issues  in MICS

DQ.6 to DQ.9

Completeness of reporting of date of birth

 | Date of first birth: Year and month of birth | Date of first birth: Year of birth only | Date of first birth: Completed years since first birth only | Date of first birth: Other/DK/Missing | Date of first birth: Total | Date of last birth: Both month and year | Date of last birth: Year only | Date of last birth: Other/DK/Missing | Date of last birth: Total
Total | 76.8 | 8.4 | 12.0 | 2.9 | 100.0 | 98.1 | 1.6 | 0.3 | 100.0
Area: Urban | 82.5 | 7.1 | 8.3 | 2.1 | 100.0 | 97.9 | 1.7 | 0.5 | 100.0
Area: Rural | 75.7 | 8.6 | 12.7 | 3.0 | 100.0 | 98.1 | 1.5 | 0.3 | 100.0

Target for these columns should be 100 per cent – especially for date of last birth, as it concerns eligibility, and is a very recent occurrence

Page 19: Overview of Data Quality Issues  in MICS


DQ.11 Completeness of Reporting

In general, target is to keep incomplete (missing, DK, etc) below 5 per cent

Not for all types of information – especially those that relate to eligibility
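A minimal sketch of the 5 per cent rule, applied to the second DQ.5 example earlier in the deck. Treating "incomplete" as 100 minus the percentage with both year and month of birth is an assumption made for this illustration, as are the function and variable names. As the slide notes, a stricter target applies to information that determines eligibility.

```python
INCOMPLETE_TARGET = 5.0  # per cent, from the slide

# Percent reporting both year and month of birth, by age group,
# copied from the second DQ.5 example table.
percent_complete = {
    "0-4": 96.9, "5-14": 79.1, "15-24": 55.0, "25-49": 32.8,
    "50-64": 18.0, "65-84": 12.9, "85+": 7.3,
}


def groups_over_target(complete_by_group, target=INCOMPLETE_TARGET):
    """Return the groups whose incomplete share is not below the target."""
    return [
        group
        for group, complete in complete_by_group.items()
        if 100.0 - complete >= target
    ]


flagged = groups_over_target(percent_complete)
```

In that example every age group except children 0-4 misses the target, which is what prompts the slide's question about whether asking for date of birth is justified.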

Page 20: Overview of Data Quality Issues  in MICS

DQ.11 to DQ.13

Quality of anthropometric measurements

Proportion measured

Outliers

Incomplete date of birth

DQ.12: Completeness of information for anthropometric indicators: Underweight

Percent distribution of children under 5 by completeness of information on date of birth and weight, Country, Year

Columns: Valid weight and date of birth; Reason for exclusion from analysis (Weight not measured; Incomplete date of birth; Weight not measured and incomplete date of birth; Flagged cases (outliers)); Total; Percent of children excluded from analysis; Number of children under 5

Rows (each with Total = 100.0; other cells blank in the template): Total; Age: <6 months, 6-11 months, 12-23 months, 24-35 months, 36-47 months, 48-59 months

Page 21: Overview of Data Quality Issues  in MICS

DQ.12 Quality of underweight data

[Bar chart, vertical axis 0 to 25; bars: Height not measured, Incomplete date of birth, Both, Outliers, Total]

Should we actually use this data?

Children excluded due to non-response or even incomplete date of birth may not bias the results, but outliers are a big problem

Page 22: Overview of Data Quality Issues  in MICS

DQ.13 Quality of stunting data

[Bar chart, vertical axis 0 to 25; bars: Height not measured, Incomplete date of birth, Both, Outliers, Total]

Should we actually use this data?

Page 23: Overview of Data Quality Issues  in MICS

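DQ.14 Quality of wasting data

Age | Valid weight and length/height | Reason for exclusion from analysis: Weight not measured | Reason for exclusion from analysis: Length/Height not measured | Reason for exclusion from analysis: Weight and length/height not measured | Reason for exclusion from analysis: Flagged cases (outliers) | Total | Percent of children excluded from analysis
Total | 93.0 | 0.1 | 1.7 | 3.6 | 0.9 | 100.0 | 6.2
<6 months | 92.2 | 0.1 | 1.5 | 2.7 | 3.4 | 100.0 | 7.7
6-11 months | 96.3 | 0.0 | 0.5 | 2.1 | 1.0 | 100.0 | 3.5
12-23 months | 95.5 | 0.0 | 1.2 | 2.4 | 0.5 | 100.0 | 4.1
24-35 months | 91.6 | 0.0 | 3.4 | 3.6 | 0.9 | 100.0 | 8.0
36-47 months | 91.7 | 0.1 | 1.6 | 5.2 | 0.4 | 100.0 | 7.2
48-59 months | 92.2 | 0.1 | 1.2 | 4.4 | 0.5 | 100.0 | 6.1

(The column "Number of children under 5" appears in the header without values.)

Good data?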

Page 24: Overview of Data Quality Issues  in MICS

DQ.15 Heaping in anthropometric measurements

Digits | Weight | Height
Total | 100.0 | 100.0
0 | 11.4 | 16.8
1 | 9.6 | 9.1
2 | 9.5 | 11.3
3 | 9.9 | 11.5
4 | 9.8 | 9.7
5 | 10.4 | 13.0
6 | 10.4 | 8.9
7 | 9.4 | 6.9
8 | 9.9 | 6.4
9 | 9.7 | 6.3
0 or 5 | 21.8 | 29.8

Some heaping for height/length
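The "0 or 5" row of DQ.15 can be recomputed from the digit distribution, and a single summary of departure from an even spread can be added. The sketch below assumes the first data column is weight and the second is height/length, as the chart legend on the next slide suggests. The index of dissimilarity from a uniform 10% per digit is a technique brought in for illustration; it is not part of the MICS table.

```python
# Percent of measurements by terminal digit, copied from the DQ.15 table.
weight = {0: 11.4, 1: 9.6, 2: 9.5, 3: 9.9, 4: 9.8,
          5: 10.4, 6: 10.4, 7: 9.4, 8: 9.9, 9: 9.7}
height = {0: 16.8, 1: 9.1, 2: 11.3, 3: 11.5, 4: 9.7,
          5: 13.0, 6: 8.9, 7: 6.9, 8: 6.4, 9: 6.3}


def share_zero_or_five(digit_percent):
    """Percent of measurements ending in 0 or 5 (about 20 if no heaping)."""
    return digit_percent[0] + digit_percent[5]


def dissimilarity_from_uniform(digit_percent):
    """Half the summed absolute deviation from 10% per digit.

    Roughly the percent of measurements that would need a different
    terminal digit for the distribution to be even.
    """
    return sum(abs(pct - 10.0) for pct in digit_percent.values()) / 2.0


weight_share = round(share_zero_or_five(weight), 1)
height_share = round(share_zero_or_five(height), 1)
weight_index = dissimilarity_from_uniform(weight)
height_index = dissimilarity_from_uniform(height)
```

Both measures come out higher for height than for weight, which matches the slide's reading of the table.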

Page 25: Overview of Data Quality Issues  in MICS

DQ.15 Heaping in anthropometric measurements

[Bar chart: percent of measurements by terminal digit 0 to 9, vertical axis 0.0 to 30.0; series: Weight, Height]

Usually, more heaping observed in length/height measurements than weight

Page 26: Overview of Data Quality Issues  in MICS

DQ.16 to DQ.18

Observations of birth certificates, vaccination cards and women’s health cards

Two “indicators” of data quality:

Performance of interviewers

Quality of information the survey collected

Page 27: Overview of Data Quality Issues  in MICS

DQ.18 Women’s health cards

 | Woman does not have health card | Woman has health card: Seen by the interviewer (1) | Woman has health card: Not seen by the interviewer (2) | Missing/DK | Total | Percent of health cards seen by the interviewer (1)/(1+2)*100 | Number of women with a live birth in the last two years
Total | 32.1 | 26.8 | 39.5 | 1.5 | 100.0 | 40.4 | 7866
Area: Urban | 28.0 | 30.5 | 39.2 | 2.3 | 100.0 | 43.8 | 1280
Area: Rural | 32.9 | 26.1 | 39.6 | 1.4 | 100.0 | 39.7 | 6586
Wealth index quintile: Poorest | 39.8 | 21.4 | 37.0 | 1.8 | 100.0 | 36.6 | 2219
Wealth index quintile: Second | 32.6 | 29.5 | 37.0 | 0.9 | 100.0 | 44.4 | 1672
Wealth index quintile: Middle | 29.9 | 27.3 | 40.9 | 1.8 | 100.0 | 40.0 | 1490
Wealth index quintile: Fourth | 28.8 | 27.4 | 42.2 | 1.6 | 100.0 | 39.4 | 1297
Wealth index quintile: Richest | 23.4 | 31.9 | 43.3 | 1.4 | 100.0 | 42.4 | 1188

In all three tables, look for the proportion of existing documents the interviewers were able to see – as a performance indicator

Also look for the proportion of documents observed out of all under-5s or women – if these documents contain better quality information, that would be an indicator of overall quality of the data
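The performance indicator in DQ.18 is the formula in the column header, (1)/(1+2)*100. A minimal sketch follows, using three rows of the table as input; the function and variable names are illustrative.

```python
def percent_cards_seen(seen, not_seen):
    """Percent of existing cards the interviewer saw: (1) / ((1) + (2)) * 100."""
    held = seen + not_seen
    if held <= 0:
        raise ValueError("no women reported as having a health card")
    return 100.0 * seen / held


# (seen by interviewer, not seen by interviewer), in percent of women,
# copied from the DQ.18 table.
rows = {
    "Total": (26.8, 39.5),
    "Urban": (30.5, 39.2),
    "Rural": (26.1, 39.6),
}

seen_rates = {
    name: round(percent_cards_seen(seen, not_seen), 1)
    for name, (seen, not_seen) in rows.items()
}
```

The computed values agree with the published column for these rows. The second check on the slide, documents observed out of all women, is simply column (1) itself.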

Page 28: Overview of Data Quality Issues  in MICS

DQ.19 Observation of bednets and places for handwashing

Observation of places for handwashing

 | Observed | Place for handwashing not in dwelling | No permission to see | Other | Total
Total | 81.0 | 18.0 | 0.2 | 0.8 | 100.0
Area: Urban | 83.7 | 15.5 | 0.5 | 0.4 | 100.0
Area: Rural | 80.5 | 18.5 | 0.1 | 0.8 | 100.0
Wealth index quintile: Poorest | 69.7 | 29.1 | 0.2 | 1.0 | 100.0
Wealth index quintile: Second | 80.8 | 18.0 | 0.1 | 1.0 | 100.0
Wealth index quintile: Middle | 84.3 | 14.9 | 0.1 | 0.7 | 100.0
Wealth index quintile: Fourth | 86.7 | 12.5 | 0.2 | 0.6 | 100.0
Wealth index quintile: Richest | 90.7 | 8.7 | 0.4 | 0.2 | 100.0

Added complication of “moving kettles”

Page 29: Overview of Data Quality Issues  in MICS

DQ.20 Person interviewed for the under-5 questionnaire

DQ.20: Presence of mother in the household and the person interviewed for the under-5 questionnaire

Distribution of children under five by whether the mother lives in the same household, and the person who was interviewed for the under-5 questionnaire, Country, Year

Columns: Mother in the household (Mother interviewed; Father interviewed; Other adult female interviewed; Other adult male interviewed); Mother not in the household (Father interviewed; Other adult female interviewed; Other adult male interviewed); Total; Number of children under 5

Rows (each with Total = 100.0; other cells blank in the template): Total; Age: 0, 1, 2, 3, 4

Universally good data

Page 30: Overview of Data Quality Issues  in MICS

DQ.21 Random selection of children

[Bar chart, vertical axis 0 to 100; bar values: 89.6, 96.6, 99.7, 98.2, 100, 99.8]

Very significant improvement in the proportion of children correctly selected

Page 31: Overview of Data Quality Issues  in MICS

DQ.22 School attendance by single age

DQ.22: School attendance by single age

Distribution of household population age 5-24 years by educational level and grade attended in the current (or most recent) school year, Country, Year

Columns: Currently attending (Preschool; Primary school, Grade 1 to 6; Secondary school, Grade 1 to 6; Higher than secondary); Not attending school; DK/Missing; Total; Number of household members

Rows (each with Total = 100.0; other cells blank in the template): Age at beginning of school year, single years 5 to 24

Cases should fall on the diagonal – look for outliers!

Page 32: Overview of Data Quality Issues  in MICS

DQ.23 Sex ratio at birth

DQ.23: Sex ratio at birth among children ever born and living

Sex ratio (number of males per 100 females) among children ever born (at birth), children living, and deceased children, by age of women, Country, Year

Columns: Children Ever Born (Sons; Daughters; Sex ratio at birth); Children Living (Sons; Daughters; Sex ratio); Children Deceased (Sons; Daughters; Sex ratio); Number of women

Rows: Total; Age: 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49

Should be around 1.02 to 1.06 (that is, 102 to 106 males per 100 females)

Sex ratios among living children should be lower than for children deceased
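A minimal sketch of the sex ratio check, expressed as males per 100 females to match the table definition, with the 102 to 106 range from the slide as the plausibility band. The function names and the counts are hypothetical; the DQ.23 template in the deck contains no figures.

```python
EXPECTED_RANGE = (102.0, 106.0)  # males per 100 females, from the slide


def sex_ratio(sons, daughters):
    """Number of males per 100 females."""
    if daughters <= 0:
        raise ValueError("daughters must be positive")
    return 100.0 * sons / daughters


def within_expected_range(ratio, expected=EXPECTED_RANGE):
    """True if the ratio lies inside the expected band (bounds included)."""
    low, high = expected
    return low <= ratio <= high


# Hypothetical counts of children ever born, for illustration only.
plausible = sex_ratio(5250, 5000)  # 105 males per 100 females
suspicious = sex_ratio(4500, 5000)  # 90 males per 100 females
```

A ratio well outside the band for children ever born may point to sex-selective omission of births, although small age groups can stray from the range through sampling variation alone.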

Page 33: Overview of Data Quality Issues  in MICS


DQ.24 to DQ.26

Tables on the quality of information collected in birth histories

Page 34: Overview of Data Quality Issues  in MICS

DQ.24 Births by calendar years

DQ.24: Births by calendar years

Number of births, percentage with complete birth date, sex ratio at birth, and calendar year ratio by calendar year, according to living, deceased, and total children (weighted, unimputed), as reported in the birth histories, Country, Year

Columns: Number of births; Percent with complete birth date b; Sex ratio at birth c; Calendar year ratio d; each split into Living, Deceased, Total

Rows: Total; Year of birth: 2013a, 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2003, 2002, 2009-2012, 2004-2008, 1999-2003, 1994-1998, <1994, DK/missing

Rows marked "na na na" in the template: Total, 2013a, 2012, 2009-2012, 2004-2008, 1999-2003, 1994-1998, <1994, DK/missing

Important data quality indicator for birth histories
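The slide does not define the calendar year ratio. The sketch below assumes the definition commonly used in DHS-style birth history tables: births in year x divided by the average of births in years x-1 and x+1, times 100. The function name and the birth counts are hypothetical.

```python
def calendar_year_ratios(births_by_year):
    """Calendar year ratio for every year that has both neighbouring years.

    Assumed definition: 100 * B(x) / ((B(x-1) + B(x+1)) / 2).
    Years at either end of the series are skipped.
    """
    ratios = {}
    for year, births in births_by_year.items():
        before = births_by_year.get(year - 1)
        after = births_by_year.get(year + 1)
        if before is None or after is None:
            continue
        neighbour_mean = (before + after) / 2.0
        if neighbour_mean > 0:
            ratios[year] = 100.0 * births / neighbour_mean
    return ratios


# Hypothetical birth counts, for illustration only.
births = {2009: 1000, 2010: 900, 2011: 1000, 2012: 1100}
ratios = calendar_year_ratios(births)
```

Under this definition a ratio well below 100 in the year that marks the eligibility boundary, paired with a ratio above 100 in the year before it, is the usual sign that births were shifted out of the period covered by the longer child questions.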

Page 35: Overview of Data Quality Issues  in MICS

DQ.25: Reporting of age at death in days

Distribution of reported deaths under one month of age by age at death in days and the percentage of neonatal deaths reported to occur at ages 0–6 days, by 5-year periods preceding the survey (weighted, imputed), Country, Year

Columns: Number of years preceding the survey (0–4; 5–9; 10–14; 15–19); Total (0–19)

Rows: Age at death (days), single days 0 to 30; Total 0–30 days; Percent early neonatal a

Check heaping – multiples of 7, days 0 and 1

Percent early neonatal should increase by period

Compare with global numbers, earlier surveys
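The summary row of DQ.25 follows directly from the table caption: deaths at ages 0 to 6 days as a percentage of deaths at 0 to 30 days. The sketch below also adds the share of deaths reported on days 7, 14, 21 and 28 as a simple heaping check; that second measure, the function names, and the counts are assumptions made for illustration.

```python
def percent_early_neonatal(deaths_by_day):
    """Deaths at 0-6 days as a percent of deaths at 0-30 days.

    deaths_by_day -- list of 31 counts, index = age at death in days.
    """
    if len(deaths_by_day) != 31:
        raise ValueError("expected 31 counts, for days 0 to 30")
    total = sum(deaths_by_day)
    if total == 0:
        raise ValueError("no deaths reported")
    return 100.0 * sum(deaths_by_day[:7]) / total


def percent_on_multiples_of_seven(deaths_by_day):
    """Percent of deaths at 0-30 days reported on days 7, 14, 21 or 28."""
    total = sum(deaths_by_day)
    if total == 0:
        raise ValueError("no deaths reported")
    return 100.0 * sum(deaths_by_day[day] for day in (7, 14, 21, 28)) / total


# Hypothetical counts for one 5-year period, for illustration only.
deaths = [12] * 7 + [1] * 16 + [0] * 8

early = percent_early_neonatal(deaths)
weekly = percent_on_multiples_of_seven(deaths)
```

Running the same functions on each 5-year period would show whether the early neonatal share changes across periods in the way the slide expects.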

Page 36: Overview of Data Quality Issues  in MICS

DQ.26: Reporting of age at death in months

Distribution of reported deaths under two years of age by age at death in months and the percentage of infant deaths reported to occur at age under one month, for the 5-year periods of birth preceding the survey (weighted, imputed), Country, Year

Columns: Number of years preceding the survey (0–4; 5–9; 10–14; 15–19); Total (0–19)

Rows: Age at death (months): 0a, then single months 1 to 23; Total 0–11 months; Percent neonatal b

Check heaping – especially at 12 months

Percent neonatal should increase by period

Compare with global numbers, earlier surveys

Page 37: Overview of Data Quality Issues  in MICS

DQ.27 Completeness of information on siblings

DQ.27: Completeness of information on siblings

Completeness of information on the survival status of (all) siblings and age of living siblings reported by interviewed women, and age at death and years since death of siblings who have died (unweighted), Country, Year

Columns: Sisters (Number; Percent); Brothers (Number; Percent); All siblings (Number; Percent)

Rows (each Total = 100.0 in the Percent columns; other cells blank in the template):

Survival status of siblings: Living; Dead; DK/Missing; Total

Age of living siblings: Reported; DK/Missing; Total

Age at death and years since death for siblings who have died: Both reported; Only years since death reported; Only age at death reported; DK/Missing both; Total

Missing information

Missing information

Missing information

Page 38: Overview of Data Quality Issues  in MICS

DQ.27 and DQ.28

DQ.28: Sibship size and sex ratio of siblings

Mean sibship size and sex ratio of siblings at birth, Country, Year

Columns: Mean sibship size a; Sex ratio of siblings at birth b; Number of women age 15-49 years

Rows: Total; Age: 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49

a Includes the respondent

b Excludes the respondent

Mean sibship size should be increasing with age, due to falling fertility

Look for sex ratios within normal ranges

Page 39: Overview of Data Quality Issues  in MICS


Thank You