Overview of Data Quality Issues in MICS
Multiple Indicator Cluster Surveys Data Interpretation, Further Analysis and
Dissemination Workshop
Overview of Data Quality Issues in MICS
2
Data quality in MICS
Important to maintain data of the highest possible quality!
Important to examine data quality carefully before/during the interpretation of survey findings
3
Looking at data quality – Why?
Confidence in survey results
Identify limitations in results
Inform dissemination and policy formulation; avoid misleading policy makers and third parties
ALL SURVEYS ARE SUBJECT TO ERRORS
4
Errors in surveys
Two types of errors in surveys
Sampling errors
Non-sampling errors
5
Sampling error
The difference between the estimate and the true value that arises because the survey interviews a sample of respondents rather than the whole population.
6
Non-sampling errors
Other types of errors, arising at any stage of the survey process other than the sample design, including:
Management decisions
Data processing
Fieldwork performance, etc.
All survey stages are interconnected and play roles in non-sampling errors
7
Control of error in surveys
Sampling errors can be estimated before data collection, and measured after data collection
Non-sampling errors are more difficult to control and/or identify
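As a rough illustration of how a sampling error can be measured after data collection, the sketch below computes the standard error and an approximate 95 per cent confidence interval for an estimated proportion. The function names, the design effect argument `deff` and the example values are assumptions made for illustration; a real survey analysis would account for the full sample design.

```python
import math


def proportion_standard_error(p, n, deff=1.0):
    """Standard error of a proportion p estimated from n observations.

    deff is an assumed design effect: 1.0 corresponds to simple random
    sampling, and cluster samples typically have deff > 1.
    """
    if not 0.0 <= p <= 1.0 or n <= 0 or deff <= 0:
        raise ValueError("need 0 <= p <= 1, n > 0 and deff > 0")
    return math.sqrt(deff * p * (1.0 - p) / n)


def confidence_interval(p, n, deff=1.0, z=1.96):
    """Approximate confidence interval (normal approximation)."""
    se = proportion_standard_error(p, n, deff)
    return max(0.0, p - z * se), min(1.0, p + z * se)


# Hypothetical example: an estimate of 0.40 from 1,000 respondents, deff 2.0
se = proportion_standard_error(0.40, 1000, deff=2.0)
low, high = confidence_interval(0.40, 1000, deff=2.0)
print(f"SE = {se:.4f}, 95% CI = ({low:.3f}, {high:.3f})")
```

Non-sampling errors have no comparable formula, which is why the data quality tables discussed next look for their symptoms instead.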
8
Minimizing non-sampling errors in MICS
MICS has a series of recommendations for quality assurance, including:
Roles and responsibilities of fieldwork teams
Easy-to-use data processing programs
Training length and content
Editing and supervision guidelines
Survey tools
Failure to comply with principles behind these recommendations leads to problems in data quality
9
MICS data quality survey tools
Survey tools to monitor and improve quality, assess quality, and identify non-sampling errors:
Field check tables to quantitatively identify non-sampling errors during data collection and to improve quality
• Possible with simultaneous data entry, when data collection is not too rapid
Data quality tables to be produced at the time of the final report
10
Data quality tables
A total of 28 tables
Data quality tables look at:
Departures from expected (demographic, biological, etc.) patterns
Departures from recommended procedures
Internal consistency
Completeness
Indicators of performance
11
DQ.1 Age Distribution of Household Population
Deficit at ages 0-1?
Heaping at age 5?
Overall quality - heaping
Deficit – males AND females?
More heaping at age 50 for females than males
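Heaping on ages ending in 0 or 5 can be summarised with a single number. The sketch below uses Whipple's index, a standard demographic measure; the slide does not say which measure the workshop uses, so treat the choice as an assumption. The single-year age counts are hypothetical.

```python
def whipples_index(pop_by_age):
    """Whipple's index for ages 23-62.

    pop_by_age maps single year of age -> population count.
    100 means no preference for ages ending in 0 or 5;
    500 means every reported age ends in 0 or 5.
    """
    heaped = sum(pop_by_age.get(age, 0) for age in range(25, 61, 5))
    total = sum(pop_by_age.get(age, 0) for age in range(23, 63))
    if total == 0:
        raise ValueError("no population reported at ages 23-62")
    return 100.0 * heaped / (total / 5.0)


# Hypothetical data: 100 people at every age, no heaping
uniform = {age: 100 for age in range(0, 100)}

# Hypothetical data: twice as many people reported at ages ending in 0 or 5
heaped = {age: (200 if age % 5 == 0 else 100) for age in range(0, 100)}

print(round(whipples_index(uniform), 1))
print(round(whipples_index(heaped), 1))
```

Computing the index separately for males and females gives a direct way to compare heaping by sex, as the slide suggests.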
12
DQ.2 Age Distribution of Eligible and Interviewed Women

                 Household population   Interviewed women     Percentage of eligible
                 of women age 10-54     age 15-49             women interviewed
Age              Number                 Number    Percent     (Completion rate)
10-14            450                    .         .           .
15-19            522                    507       15.2        97.1
20-24            552                    537       16.1        97.2
25-29            496                    478       14.3        96.4
30-34            507                    487       14.6        95.9
35-39            459                    444       13.3        96.7
40-44            421                    415       12.4        98.6
45-49            484                    469       14.1        96.9
50-54            441                    .         .           .
Total (15-49)    3441                   3336      100.0       96.9

Low response rates for women at young ages
Surplus at age 50-54?
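The two checks on this slide can be reproduced from the counts in the table. The sketch below recomputes the completion rate for each age group and the ratio of women age 50-54 to women age 45-49 in the household population. The variable and function names are illustrative. The published figures appear to be weighted and rounded, so a recomputed rate can differ from the table in the last digit.

```python
# Counts copied from the DQ.2 table:
# age group -> (women in household population, women interviewed)
women = {
    "15-19": (522, 507),
    "20-24": (552, 537),
    "25-29": (496, 478),
    "30-34": (507, 487),
    "35-39": (459, 444),
    "40-44": (421, 415),
    "45-49": (484, 469),
}
household_50_54 = 441  # women age 50-54 in the household population


def completion_rate(eligible, interviewed):
    """Per cent of eligible respondents with a completed interview."""
    return 100.0 * interviewed / eligible


rates = {age: round(completion_rate(e, i), 1) for age, (e, i) in women.items()}

# A ratio well above 1 would suggest women being pushed out of the
# eligible 45-49 group into 50-54.
ratio_50_54_to_45_49 = household_50_54 / women["45-49"][0]

for age, rate in rates.items():
    print(age, rate)
print("Ratio of 50-54 to 45-49:", round(ratio_50_54_to_45_49, 2))
```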
13
DQ.3 Age Distribution of Eligible and Interviewed Men

                 Household population   Interviewed men       Percentage of eligible
                 of men age 10-54       age 15-49             men interviewed
Age              Number                 Number    Percent     (Completion rate)
10-14            347                    na        na          na
15-19            344                    323       9.9         94.0
20-24            422                    407       12.4        96.3
25-29            586                    566       17.3        96.6
30-34            599                    571       17.4        95.3
35-39            469                    443       13.5        94.6
40-44            447                    434       13.3        97.2
45-49            545                    531       16.2        97.4
50-54            463                    na        na          na
Total (15-49)    3411                   3275      100.0       96.0

Ratio of 50-54 to 45-49: 0.85

Low response rates for men at young ages
Surplus at age 50-54?
Might also want to look at the number eligible/number in the household list, by age
14
DQ.4 Age Distribution of Children
                 Household population   Under-5s with           Percentage of eligible
                 of children 0-7 years  completed interviews    under-5s interviewed
Age              Number                 Number    Percent       (Completion rate)
0                4621                   3966      19.0          85.8
1                4552                   4102      19.6          90.1
2                4660                   4177      20.0          89.6
3                4760                   4329      20.7          91.0
4                4837                   4347      20.8          89.9
5                4852                   .         .             .
6                5637                   .         .             .
7                5919                   .         .             .
Total (0-4)      23430                  20922     100.0         89.3

Low response rates for infants?
Out-transference?
15
DQ.5 Birth Date Reporting, Household Population
Completeness of reporting of month and year of birth

Age          Year and month   Year of birth   Month of birth   Both      Total
             of birth         only            only             missing
Total        99.2             0.7             0.0              0.1       100.0
0-4          100.0            0.0             0.0              0.0       100.0
5-14         99.9             0.1             0.0              0.0       100.0
15-24        99.9             0.1             0.0              0.0       100.0
25-49        99.8             0.2             0.0              0.0       100.0
50-64        99.0             0.9             0.0              0.1       100.0
65-84        96.1             3.9             0.0              0.0       100.0
85+          89.1             10.9            0.0              0.0       100.0
DK/missing   22.2             0.0             0.0              77.8      100.0
Is the inclusion of question on date of birth justified?
16
DQ.5 Birth Date Reporting, Household Population
Completeness of reporting of month and year of birth

Age          Year and month   Year of birth   Month of birth   Both      Total
             of birth         only            only             missing
Total        51.3             15.9            1.3              31.5      100.0
0-4          96.9             1.8             0.2              1.1       100.0
5-14         79.1             8.9             1.8              10.1      100.0
15-24        55.0             15.8            2.3              26.9      100.0
25-49        32.8             22.4            1.0              43.8      100.0
50-64        18.0             23.0            0.5              58.6      100.0
65-84        12.9             20.8            0.2              66.1      100.0
85+          7.3              17.5            0.0              75.2      100.0
DK/missing   10.5             15.8            0.0              73.7      100.0
Is the inclusion of question on date of birth justified?
17
DQ.6 to DQ.9
Birth Date and Age Reporting for women, men, under-5, and children, adolescents and young people – same structure
DQ.6: Birth date and age reporting: Women
Percent distribution of women age 15-49 years by completeness of date of birth/age information, Country, Year
Columns: Completeness of reporting of date of birth and age (Year and month of birth; Year of birth and age; Year of birth only; Age only; Other/DK/Missing); Total; Number of women age 15-49 years
Rows: Total; Region (Region 1, Region 2, Region 3, Region 4, Region 5); Area (Urban, Rural)
Total column: 100.0 in every row
More important to have full birth dates for individual respondents, adolescents, young people
18
DQ.6 to DQ.9
Completeness of reporting of date of birth

         Date of first birth                                                          Date of last birth
Area     Year and month   Year of birth   Completed years since   Other/DK/   Total   Both month   Year only   Other/DK/   Total
         of birth         only            first birth only        Missing             and year                 Missing
Total    76.8             8.4             12.0                    2.9         100.0   98.1         1.6         0.3         100.0
Urban    82.5             7.1             8.3                     2.1         100.0   97.9         1.7         0.5         100.0
Rural    75.7             8.6             12.7                    3.0         100.0   98.1         1.5         0.3         100.0
Target for these columns should be 100 per cent – especially for date of last birth, as it concerns eligibility, and is a very recent occurrence
19
DQ.11 Completeness of Reporting
In general, the target is to keep incomplete responses (missing, DK, etc.) below 5 per cent
This does not hold for all types of information, especially those that relate to eligibility, where completeness should be close to 100 per cent
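A check against the 5 per cent target is simple to automate. The sketch below flags the age groups where the share with both month and year of birth missing reaches the threshold, using the percentages from the second DQ.5 example shown earlier. The function name and the choice to apply the threshold to that particular column are assumptions for illustration.

```python
THRESHOLD = 5.0  # per cent; the general target stated on the slide

# Per cent with both month and year of birth missing,
# copied from the second DQ.5 example
both_missing = {
    "0-4": 1.1,
    "5-14": 10.1,
    "15-24": 26.9,
    "25-49": 43.8,
    "50-64": 58.6,
    "65-84": 66.1,
    "85+": 75.2,
}


def groups_at_or_above_threshold(pct_incomplete, threshold=THRESHOLD):
    """Return the groups whose incomplete share is not below the threshold."""
    return [group for group, pct in pct_incomplete.items() if pct >= threshold]


flagged = groups_at_or_above_threshold(both_missing)
print(flagged)
```

For information that determines eligibility, a stricter threshold can be passed in through the `threshold` argument.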
20
DQ.12 to DQ.14
Quality of anthropometric measurements:
Proportion measured
Outliers
Incomplete date of birth
DQ.12: Completeness of information for anthropometric indicators: Underweight
Percent distribution of children under 5 by completeness of information on date of birth and weight, Country, Year
Columns: Valid weight and date of birth; Reason for exclusion from analysis (Weight not measured; Incomplete date of birth; Weight not measured and incomplete date of birth; Flagged cases (outliers)); Total; Percent of children excluded from analysis; Number of children under 5
Rows: Total; Age (<6 months, 6-11 months, 12-23 months, 24-35 months, 36-47 months, 48-59 months)
Total column: 100.0 in every row
21
DQ.12 Quality of underweight data
[Bar chart: Height not measured, Incomplete date of birth, Both, Outliers, Total; vertical axis 0 to 25]
Should we actually use this data?
Excluding children because of non-response or even incomplete date of birth may not introduce bias, but outliers are a big problem
22
DQ.13 Quality of stunting data
[Bar chart: Height not measured, Incomplete date of birth, Both, Outliers, Total; vertical axis 0 to 25]
Should we actually use this data?
23
DQ.14 Quality of wasting data
                Valid weight    Reason for exclusion from analysis                                  Total    Percent of      Number of
                and             Weight not   Length/Height   Weight and          Flagged cases               children        children
                length/height   measured     not measured    length/height not   (outliers)                  excluded from   under 5
Age                                                          measured                                        analysis
Total           93.0            0.1          1.7             3.6                 0.9             100.0    6.2
<6 months       92.2            0.1          1.5             2.7                 3.4             100.0    7.7
6-11 months     96.3            0.0          0.5             2.1                 1.0             100.0    3.5
12-23 months    95.5            0.0          1.2             2.4                 0.5             100.0    4.1
24-35 months    91.6            0.0          3.4             3.6                 0.9             100.0    8.0
36-47 months    91.7            0.1          1.6             5.2                 0.4             100.0    7.2
48-59 months    92.2            0.1          1.2             4.4                 0.5             100.0    6.1

Good data?
24
DQ.15 Heaping in anthropometric measurements
Digits     Weight   Height
Total      100.0    100.0
0          11.4     16.8
1          9.6      9.1
2          9.5      11.3
3          9.9      11.5
4          9.8      9.7
5          10.4     13.0
6          10.4     8.9
7          9.4      6.9
8          9.9      6.4
9          9.7      6.3
0 or 5     21.8     29.8

Some heaping for height/length
25
DQ.15 Heaping in anthropometric measurements
[Bar chart: distribution of digits 0 to 9 for Weight and Height; vertical axis 0.0 to 30.0]
Usually, more heaping observed in length/height measurements than weight
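Digit preference can be quantified from the DQ.15 distribution. The sketch below computes the share of measurements ending in 0 or 5 and an index of dissimilarity from a uniform distribution (half the sum of absolute deviations from an equal share per digit). The percentages come from the DQ.15 table; reading the first column as weight and the second as height is an assumption based on the chart legend, and the choice of index is illustrative.

```python
# Per cent of measurements by final digit 0-9, from the DQ.15 table.
# Assumption: first column is weight, second is height/length.
weight = [11.4, 9.6, 9.5, 9.9, 9.8, 10.4, 10.4, 9.4, 9.9, 9.7]
height = [16.8, 9.1, 11.3, 11.5, 9.7, 13.0, 8.9, 6.9, 6.4, 6.3]


def share_zero_or_five(dist):
    """Per cent of measurements whose final digit is 0 or 5."""
    return dist[0] + dist[5]


def dissimilarity_from_uniform(dist):
    """Half the summed absolute deviation from an equal share per digit.

    0 means no digit preference; larger values mean more heaping.
    """
    expected = sum(dist) / len(dist)
    return 0.5 * sum(abs(p - expected) for p in dist)


for name, dist in (("Weight", weight), ("Height", height)):
    print(name,
          round(share_zero_or_five(dist), 1),
          round(dissimilarity_from_uniform(dist), 1))
```

With no digit preference, about 20 per cent of measurements would end in 0 or 5.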
26
DQ.16 to DQ.18
Observations of birth certificates, vaccination cards and women’s health cards
Two “indicators” of data quality:
Performance of interviewers
Quality of information the survey collected
27
DQ.18 Women’s health cards
                   Woman does    Woman has health card             Missing/DK   Total    Percent of health   Number of women
                   not have      Seen by the      Not seen by the                        cards seen by the   with a live birth
                   health card   interviewer (1)  interviewer (2)                        interviewer         in the last two
                                                                                         (1)/(1+2)*100       years
Total              32.1          26.8             39.5              1.5          100.0    40.4                7866
Area
Urban              28.0          30.5             39.2              2.3          100.0    43.8                1280
Rural              32.9          26.1             39.6              1.4          100.0    39.7                6586
Wealth index quintile
Poorest            39.8          21.4             37.0              1.8          100.0    36.6                2219
Second             32.6          29.5             37.0              0.9          100.0    44.4                1672
Middle             29.9          27.3             40.9              1.8          100.0    40.0                1490
Fourth             28.8          27.4             42.2              1.6          100.0    39.4                1297
Richest            23.4          31.9             43.3              1.4          100.0    42.4                1188

In all three tables, look for the proportion of existing documents the interviewers were able to see – as a performance indicator
Also look for the proportion of documents observed out of all under-5s or women – if these documents contain better quality information, that would be an indicator of overall quality of the data
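The performance indicator in the DQ.18 table is the formula (1)/(1+2)*100. The sketch below applies it to three rows of the table; the function and variable names are illustrative.

```python
def percent_cards_seen(seen, not_seen):
    """Per cent of existing health cards that the interviewer saw.

    seen     -- per cent of women whose card was seen, column (1)
    not_seen -- per cent of women with a card that was not seen, column (2)
    """
    return 100.0 * seen / (seen + not_seen)


# (column 1, column 2) copied from the DQ.18 table
rows = {
    "Total": (26.8, 39.5),
    "Urban": (30.5, 39.2),
    "Rural": (26.1, 39.6),
}

for label, (seen, not_seen) in rows.items():
    print(label, round(percent_cards_seen(seen, not_seen), 1))
```

The second indicator mentioned on the slide, documents observed out of all women, is column (1) itself, for example 26.8 per cent in the Total row.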
28
DQ.19 Observation of bednets and places for handwashing
Observation of places for handwashing

                   Observed   Place for handwashing   No permission   Other   Total
                              not in dwelling         to see
Total              81.0       18.0                    0.2             0.8     100.0
Area
Urban              83.7       15.5                    0.5             0.4     100.0
Rural              80.5       18.5                    0.1             0.8     100.0
Wealth index quintile
Poorest            69.7       29.1                    0.2             1.0     100.0
Second             80.8       18.0                    0.1             1.0     100.0
Middle             84.3       14.9                    0.1             0.7     100.0
Fourth             86.7       12.5                    0.2             0.6     100.0
Richest            90.7       8.7                     0.4             0.2     100.0
Added complication of “moving kettles”
29
DQ.20 Person interviewed for the under-5 questionnaire
DQ.20: Presence of mother in the household and the person interviewed for the under-5 questionnaire
Distribution of children under five by whether the mother lives in the same household, and the person who was interviewed for the under-5 questionnaire, Country, Year
Columns: Mother in the household (Mother interviewed; Father interviewed; Other adult female interviewed; Other adult male interviewed); Mother not in the household (Father interviewed; Other adult female interviewed; Other adult male interviewed); Total; Number of children under 5
Rows: Total; Age (0, 1, 2, 3, 4)
Total column: 100.0 in every row
Universally good data
30
DQ.21 Random selection of children
[Bar chart: vertical axis 0 to 100; bar values 89.6, 96.6, 99.7, 98.2, 100, 99.8]
Very significant improvement in the proportion of children correctly selected
31
DQ.22 School attendance by single age
DQ.22: School attendance by single age
Distribution of household population age 5-24 years by educational level and grade attended in the current (or most recent) school year, Country, Year
Columns: Not attending school; Currently attending (Preschool; Primary school, Grade 1 to 6; Secondary school, Grade 1 to 6; Higher than secondary); DK/Missing; Total; Number of household members
Rows: Age at beginning of school year, single years 5 to 24
Total column: 100.0 in every row

Cases should fall on the diagonal – look for outliers!
32
DQ.23 Sex ratio at birth
DQ.23: Sex ratio at birth among children ever born and living
Sex ratio (number of males per 100 females) among children ever born (at birth), children living, and deceased children, by age of women, Country, Year
Columns: Children Ever Born (Sons; Daughters; Sex ratio at birth); Children Living (Sons; Daughters; Sex ratio); Children Deceased (Sons; Daughters; Sex ratio); Number of women
Rows: Total; Age (15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49)

Sex ratio at birth should be around 1.02 to 1.06 (102 to 106 males per 100 females)
Sex ratios among living children should be lower than for children deceased
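Both checks on this slide reduce to simple ratios of sons to daughters. The sketch below computes them for hypothetical counts (the slide contains no data). The ratio is expressed per female, matching the 1.02 to 1.06 range; multiply by 100 for the per-100-females form used in the table title. All names and numbers are illustrative.

```python
def sex_ratio(sons, daughters):
    """Number of males per female."""
    if daughters == 0:
        raise ValueError("no daughters reported")
    return sons / daughters


def within_expected_range(ratio, low=1.02, high=1.06):
    """True if the sex ratio at birth lies in the range given on the slide."""
    return low <= ratio <= high


# Hypothetical counts of (sons, daughters)
ever_born = (5250, 5000)
deceased = (450, 400)
living = (ever_born[0] - deceased[0], ever_born[1] - deceased[1])

ratio_at_birth = sex_ratio(*ever_born)
ratio_living = sex_ratio(*living)
ratio_deceased = sex_ratio(*deceased)

print("At birth:", round(ratio_at_birth, 3), within_expected_range(ratio_at_birth))
print("Living lower than deceased:", ratio_living < ratio_deceased)
```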
33
DQ.24 to DQ.26
Tables on the quality of information collected in birth histories
34
DQ.24 Births by calendar years
DQ.24: Births by calendar years
Number of births, percentage with complete birth date, sex ratio at birth, and calendar year ratio by calendar year, according to living, deceased, and total children (weighted, unimputed), as reported in the birth histories, Country, Year
Columns: Number of births; Percent with complete birth date (b); Sex ratio at birth (c); Calendar year ratio (d); each shown for Living, Deceased and Total
Rows: Total; Year of birth (2013 (a), 2012, 2011, 2010, 2009, 2008, 2007, 2006, 2005, 2003, 2002); 2009-2012; 2004-2008; 1999-2003; 1994-1998; <1994; DK/missing
Calendar year ratio is "na" for Total, 2013, 2012, the grouped periods and DK/missing

Important data quality indicator for birth histories
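The slide does not reproduce footnote d, which defines the calendar year ratio. The sketch below assumes the definition commonly used for birth history tables: births in year x divided by the average of births in the two adjacent years, times 100. Values far from 100 suggest births displaced from one calendar year to another. The birth counts are hypothetical.

```python
def calendar_year_ratio(births, year):
    """Births in `year` relative to the average of the adjacent years, x 100.

    Assumed definition: 100 * B(x) / ((B(x-1) + B(x+1)) / 2).
    Returns None when the ratio cannot be computed ("na" in the table).
    """
    previous = births.get(year - 1)
    following = births.get(year + 1)
    if year not in births or previous is None or following is None:
        return None
    if previous + following == 0:
        return None
    return 200.0 * births[year] / (previous + following)


# Hypothetical numbers of births by calendar year, with a deficit in 2007
# and a surplus in 2008
births = {2005: 950, 2006: 1000, 2007: 800, 2008: 1150, 2009: 1000}

for year in sorted(births):
    ratio = calendar_year_ratio(births, year)
    print(year, "na" if ratio is None else round(ratio, 1))
```

The first and last years have no neighbour on one side, which is consistent with the "na" cells in the table.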
35
DQ.25: Reporting of age at death in days
Distribution of reported deaths under one month of age by age at death in days and the percentage of neonatal deaths reported to occur at ages 0–6 days, by 5-year periods preceding the survey (weighted, imputed), Country, Year
Columns: Number of years preceding the survey (0–4, 5–9, 10–14, 15–19); Total (0–19)
Rows: Age at death (days), 0 to 30; Total 0–30 days; Percent early neonatal (a)

Check heaping – multiples of 7, days 0 and 1
Percent early neonatal should increase by period
Compare with global numbers, earlier surveys
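The summary row of DQ.25, the percentage of neonatal deaths reported at ages 0–6 days, can be computed directly from the distribution of deaths by day. The sketch below also includes a simple heaping ratio that compares one day with the average of its two neighbours; that ratio is an illustrative device rather than part of the table. The death counts are hypothetical.

```python
def percent_early_neonatal(deaths_by_day):
    """Per cent of deaths at 0-30 days that occurred at 0-6 days."""
    early = sum(deaths_by_day.get(day, 0) for day in range(0, 7))
    total = sum(deaths_by_day.get(day, 0) for day in range(0, 31))
    if total == 0:
        raise ValueError("no deaths reported at 0-30 days")
    return 100.0 * early / total


def heaping_ratio(deaths_by_day, day):
    """Deaths on `day` relative to the average of the two adjacent days.

    Returns None when both adjacent days have no reported deaths.
    """
    neighbours = deaths_by_day.get(day - 1, 0) + deaths_by_day.get(day + 1, 0)
    if neighbours == 0:
        return None
    return 2.0 * deaths_by_day.get(day, 0) / neighbours


# Hypothetical reported deaths by age at death in days, heaped on day 7
deaths = {0: 40, 1: 30, 2: 10, 3: 8, 4: 4, 5: 3, 6: 2,
          7: 12, 8: 1, 14: 6, 21: 2, 28: 2}

print(round(percent_early_neonatal(deaths), 1))
print(heaping_ratio(deaths, 7))
```

Heaping on day 7 moves deaths out of the 0–6 day group and lowers the percent early neonatal, which is one reason the two checks are read together.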
36
DQ.26: Reporting of age at death in months
Distribution of reported deaths under two years of age by age at death in months and the percentage of infant deaths reported to occur at age under one month, for the 5-year periods of birth preceding the survey (weighted, imputed), Country, Year
Columns: Number of years preceding the survey (0–4, 5–9, 10–14, 15–19); Total (0–19)
Rows: Age at death (months), 0 (a) to 23; Total 0–11 months; Percent neonatal (b)

Check heaping – especially at 12 months
Percent neonatal should increase by period
Compare with global numbers, earlier surveys
37
DQ.27 Completeness of information on siblings
DQ.27: Completeness of information on siblings
Completeness of information on the survival status of (all) siblings and age of living siblings reported by interviewed women, and age at death and years since death of siblings who have died (unweighted), Country, Year
Columns: Sisters (Number; Percent); Brothers (Number; Percent); All siblings (Number; Percent)
Rows:
Survival status of siblings: Living; Dead; DK/Missing; Total (100.0)
Age of living siblings: Reported; DK/Missing; Total (100.0)
Age at death and years since death for siblings who have died: Both reported; Only years since death reported; Only age at death reported; DK/Missing both; Total (100.0)

Missing information
38
DQ.27 and DQ.28
DQ.28: Sibship size and sex ratio of siblings
Mean sibship size and sex ratio of siblings at birth, Country, Year
Columns: Mean sibship size (a); Sex ratio of siblings at birth (b); Number of women age 15-49 years
Rows: Total; Age (15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49)
a Includes the respondent
b Excludes the respondent

Mean sibship size should be increasing with age, due to falling fertility
Look for sex ratios within normal ranges
39
Thank You