
Data Quality Assurance in Bangladesh Demographic & Health Surveys
Nitai Chakraborty
Professor, Department of Statistics, Biostatistics & Informatics, University of Dhaka

Organization of the Presentation

- Data quality assurance: concept and coverage
- Components of the data quality assurance steps undertaken in Demographic & Health Surveys (DHS) during field implementation and data processing

Data Quality Assurance: Concept & Coverage

Quality assurance can be understood as a total quality management paradigm that examines the survey process at each step of survey design and implementation, in order to improve the output of the survey in terms of its relevance, accuracy, coherence and comparability.

The major areas of a survey where a quality assurance protocol should be strictly administered are:

- Selection of the survey institution
- Sampling design and sample size
- Design and testing of the data collection instruments
- Recruitment and training of field staff
- Data quality control during field implementation and the data entry process

Data quality control steps during field implementation and the data entry process

The components of the data quality assurance steps in most DHS surveys during field implementation are:

- Supervision and monitoring of fieldwork through quality control teams
- Monitoring data quality and the performance of field staff through field-check tables
- Double entry verification
- Secondary editing

Monitoring of fieldwork through quality control teams

All DHS surveys employed quality control teams that worked in the field for the entire duration of the fieldwork, circulating among all teams, to ensure that:

- Senior staff closely observe the field interviewers during the first few days of fieldwork and give them immediate feedback.
- All completed questionnaires are thoroughly edited within a day of the interview, or at least before the team leaves the sample cluster.
- Supervisors and field editors thoroughly scrutinize all questionnaires and tactfully discuss all errors with the interviewer.

In most DHS surveys, one of the supervisor's responsibilities is to conduct re-interviews with approximately 5 percent of the households covered in the survey. The purpose of the re-interviews is to ensure that interviewers are visiting the selected households and that they do not intentionally leave out eligible household members or misreport their ages so as to reduce their workload.

Monitoring data quality with field-check tables

Field-check tables are one way of monitoring data quality while the fieldwork is still in progress. All DHS surveys, including the BMMS, used the Census and Survey Processing System (CSPro) software package for data processing. Field-check tables on important aspects of data quality are produced regularly using the CSPro data processing application.

- Use of the field-check tables is especially crucial during the early stages of fieldwork, when the option remains to retrain personnel, modify procedures or re-interview problem clusters.
- Each table focuses on an important aspect of data quality.
- Supervisors run the field-check tables every week, starting after the first batch of cluster data has been entered. As fieldwork progresses and becomes more settled and routine, the checks become bi-weekly.

Table FC-1: Household response rate
Monitors the performance of interview teams in terms of non-response to the household questionnaire. The supervisor should be informed, and remedial action taken, if a team or interviewer shows an exceptional pattern of non-response.
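Table FC-1 defines the household response rate as (1) / (1 + 2 + 4 + 5 + 8) * 100, where the numbers are the result codes of the household interview. The following is a minimal Python sketch of that formula, not the CSPro application used in the surveys; the function name and argument names are illustrative assumptions. The example values are the Team 3 percentages from Table FC-1.

```python
def household_response_rate(completed, present_no_resp, postponed, refused, not_found):
    """Return the household response rate in percent.

    The arguments correspond to result codes (1), (2), (4), (5) and (8) of
    Table FC-1 and may be given as counts or as percentages of sampled households.
    """
    denominator = completed + present_no_resp + postponed + refused + not_found
    if denominator == 0:
        raise ValueError("no households in the response-rate denominator")
    return 100.0 * completed / denominator


# Team 3 row of Table FC-1: (1)=87.7, (2)=2.3, (4)=3.0, (5)=6.0, (8)=0.7
team3_rate = household_response_rate(87.7, 2.3, 3.0, 6.0, 0.7)
print(round(team3_rate, 1))
```

Vacant, destroyed and absent households (codes 3, 6, 7) and "other" (code 9) are left out of the denominator, so a team is not penalized for dwellings in which no interview was possible.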

Table FC-2: Eligible women per household
One way for interviewers to reduce their workload is to deliberately omit eligible women from the household, or to estimate their ages to be below or above the cutoff ages for eligibility (15-49). Table FC-2 monitors the number of eligible women per household.
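The note to Table FC-2 sets the target at 80% of the number of women per household expected in the sample design. A minimal sketch of that check follows. The helper names are hypothetical, and the design expectation of 1.2 women per household is only the illustrative figure from the table note, applied here to the rural counts of Table FC-2 as an assumption; the actual country-specific targets are not given in the presentation.

```python
def minimum_target(expected_per_household, fraction=0.8):
    """Minimum acceptable mean number of eligible women per household."""
    return fraction * expected_per_household


def teams_below_target(team_counts, target):
    """Return {team: mean} for teams whose mean women per household is below target.

    team_counts maps a team name to (completed households, eligible women).
    """
    flagged = {}
    for team, (households, women) in team_counts.items():
        mean = women / households
        if mean < target:
            flagged[team] = round(mean, 2)
    return flagged


# Rural counts from Table FC-2: (completed households, eligible women)
rural_counts = {
    "Team 1": (86, 73),
    "Team 2": (214, 225),
    "Team 3": (82, 119),
    "Team 4": (120, 112),
    "Team 5": (161, 190),
}

target = minimum_target(1.2)  # assumed design expectation, taken from the table note
flagged = teams_below_target(rural_counts, target)
print(flagged)
```

Under this assumed expectation the minimum is 0.96 women per household, and the flagged teams coincide with the rural "Target not met" entries of Table FC-2.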

Table FC-1 Household response rates
Percent distribution of sampled households by result of household interview, and household response rate, by interviewer team

Result of household interview:
(1) Completed
(2) HH present, no resp.
(3) Household absent
(4) Postponed
(5) Refused
(6) Dwelling vacant
(7) Dwelling destroyed
(8) Dwelling not found
(9) Other

Team         (1)   (2)   (3)   (4)   (5)   (6)   (7)   (8)   (9)   Total  Number  Response rate (%)*
Team 1      97.0   0.0   0.5   0.6   1.3   0.4   0.1   0.1   0.0   100.0     325                98.0
Team 2      96.5   1.0   0.0   1.0   0.9   0.0   0.5   0.1   0.0   100.0     365                97.0
Team 3      87.7   2.3   0.0   3.0   6.0   0.1   0.2   0.7   0.0   100.0     347                88.0
Team 4      98.2   0.2   0.2   0.7   0.2   0.0   0.3   0.0   0.2   100.0     352                98.9
All teams   94.8   0.9   0.2   1.3   2.1   0.1   0.3   0.2   0.1   100.0    1389                95.4

* HH response rate = (1) / (1 + 2 + 4 + 5 + 8) * 100

Table FC-2 Eligible women per household
Mean number of de facto eligible women per household, according to interviewer team

Urban
Team        Completed HHs  Eligible women  Mean per HH  Target not met
Team 1                 46              65         1.41
Team 2                172             243         1.41
Team 3                139             158         1.14            1.14
Team 4                197             236         1.20
Team 5                116             131         1.13            1.13
All teams             670             833         1.24

Rural
Team        Completed HHs  Eligible women  Mean per HH  Target not met
Team 1                 86              73         0.85            0.85
Team 2                214             225         1.05
Team 3                 82             119         1.45
Team 4                120             112         0.93            0.93
Team 5                161             190         1.18
All teams             663             719         1.08

Note: The number of women expected per HH is country-specific and defined in the sample design (it usually differs between urban and rural areas). The target is the minimum mean number of de facto eligible women per HH, and should be 80% of what was expected at the time of sample design. Thus, if 1.2 women per HH were estimated at the time of sample design, teams should be finding a minimum of 0.96 women per HH. Country managers should provide data processors with the country-specific targets.

Table FC-3: Age displacement
Examines whether interviewers are intentionally displacing the age of young women from the eligible range (15 and over) to an ineligible age (14 and under). An age ratio less than 100 indicates a deficit of women 15 years old compared with those 14 and 16 years old, and might indicate intentional displacement.

Table FC-4
Similar to FC-3, this table examines whether interviewers are displacing women over the upper age eligibility boundary, i.e. from ages under 50 to ages 50 and over. An age ratio less than 100 indicates a deficit of women 49 years old compared with those 48 and 50 years old, and might indicate intentional displacement.
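Table FC-3 reports the age ratio as the number of women age 15 divided by the number of women age 14, with a target above 0.8. The sketch below follows that table definition (not the 14-and-16 comparison mentioned in the description); the function and variable names are illustrative assumptions, and the counts are taken from three team rows of Table FC-3.

```python
def age_ratio(counts_by_age, numerator_age=15, denominator_age=14):
    """Ratio of women at the eligible boundary age to women one year younger."""
    denominator = counts_by_age[denominator_age]
    if denominator == 0:
        return None  # ratio undefined when nobody is listed at the lower age
    return counts_by_age[numerator_age] / denominator


TARGET = 0.8  # target stated in the note to Table FC-3

# Women listed at ages 14 and 15, from Table FC-3
teams = {
    "Team 1": {14: 11, 15: 8},
    "Team 3": {14: 11, 15: 13},
    "Team 4": {14: 13, 15: 5},
}

target_not_met = {}
for team, counts in teams.items():
    ratio = age_ratio(counts)
    if ratio is not None and ratio <= TARGET:
        target_not_met[team] = round(ratio, 2)

print(target_not_met)
```

The same function could in principle be reused for the upper boundary checked in Table FC-4 by passing different ages, although the presentation does not state the ratio definition or target used there.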

Table FC-3 Age displacement
Number of all women* 12-18 years listed in the household schedule by single years of age, and age ratio 15/14, according to interviewer team

Team         12   13   14   15   16   17   18   Total   Age ratio (15/14)   Target not met
Team 1       10   11   11    8    8    7    9      64                0.73             0.73
Team 2       11   11   12    9    8   10    7      68                0.75             0.75
Team 3       12   12   11   13   11   11    9      79                1.18                -
Team 4       12   16   13    5    6    8    7      67                0.38             0.38
All teams    45   50   47   35   33   36   32     278                0.74             0.74

Note: Target is an age ratio of women age 15 / women age 14 > 0.8
* All women = de facto + de jure

Table FC-5: Birth displacement
Some interviewers intentionally displace the birth dates of children from the fourth or fifth year to the sixth year before the year of the survey, so as to decrease the length and difficulty of their assigned interviewing task. This practice seriously undermines the quality of the data. Field-check Table 5 measures the performance of interviewers regarding displacement of births from calendar years after the cutoff date (say, January 2004) to before the cutoff date. If significant displacement has occurred, the birth year ratio will be much lower than 100, which is the ratio observed when the number of births changes smoothly from the year before the cutoff (2003) to the year after the cutoff (2005).
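Table FC-5 reports the birth year ratio as births in the cutoff year divided by births in the preceding year (2004/2003), with a target of 0.8. A minimal sketch of that table definition follows; the function name and the `cutoff_year` parameter are illustrative assumptions, and the counts are taken from the Team 1 and Team 2 rows of Table FC-5.

```python
def birth_year_ratio(births_by_year, cutoff_year):
    """Births in the cutoff year divided by births in the year before it."""
    previous = births_by_year[cutoff_year - 1]
    if previous == 0:
        return None  # ratio undefined without births in the previous year
    return births_by_year[cutoff_year] / previous


TARGET = 0.8  # target stated under Table FC-5

# Births in 2003 and 2004 for two teams, from Table FC-5
births = {
    "Team 1": {2003: 47, 2004: 35},
    "Team 2": {2003: 45, 2004: 38},
}

ratios = {team: round(birth_year_ratio(by_year, 2004), 2) for team, by_year in births.items()}
below_target = [team for team, ratio in ratios.items() if ratio < TARGET]

print(ratios)
print(below_target)
```

A ratio well below the target means far fewer births were recorded in the first year that triggers the longer set of questions than in the year just outside it, which is the pattern displacement would produce.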

Table FC-5 Birth displacement
Number of births since [2004] by year of birth, and birth year ratio [2004/2003], according to interviewer team (based on births of all women)

Team        2000  2001  2002  2003  2004  2005  2006  2007  2008  2009  Missing   Total   Ratio (2004/2003)   Target not met
Team 1        48    46    52    47    35    38    36    36    35    15        0     388                0.74             0.74
Team 2        40    37    47    45    38    41    33    27    37    12        1     358                0.84                -
Team 3        36    47    41    50    28    31    30    35    31    13        1     343                0.56             0.56
Team 4        45    43    51    51    26    49    33    43    28    14        1     384                0.51             0.51
All teams    169   173   191   193   127   159   132   141   131    54        3   1,473                0.66             0.66

Target = 0.8

Table FC-7: Completeness of date/age information for births
One of the main objectives of the survey is to estimate mortality rates for different age groups of children. This is why data are collected on the age at death of deceased children. Interviewers are required to record at least an approximate age at death for all deceased children. Field-check Table 7 monitors the performance of interviewers regarding birth date completeness. The table is divided into two parts, one for surviving and one for deceased children, since information about deceased children is typically less complete.

Table FC-7L Birth date reporting: Living children
Percent distribution of surviving births by completeness of date/age information, by interviewer team

Team        Year and month of birth given  Year and age  Year of birth only  Age only  Other  No data  Total  Number
Team 1                               94.9           4.0                 0.4       0.7    0.0      0.0  100.0     450
Team 2                               96.9           1.5                 0.4       0.4    0.4      0.2  100.0     453
Team 3                               95.8           2.2                 0.0       2.0    0.0      0.0  100.0     449
Team 4                               72.5          11.2                 6.7       4.5    2.2      2.9  100.0     448
All teams                            90.1           4.7                 1.9       1.9    0.7      0.8  100.0    1800

Table FC-7D Birth date reporting: Deceased children
Percent distribution of non-surviving births by completeness of date information, by interviewer team

Team        Month and year given  Year only  Month only  No data  Total  Number
Team 1                      88.0       10.0         0.0      2.0  100.0      50
Team 2                      95.7        4.3         0.0      0.0  100.0      47
Team 3                      94.1        5.9         0.0      0.0  100.0      51
Team 4                      69.2       19.2         1.9      9.7  100.0      52
All teams                   86.5       10.0         0.5      3.0  100.0     200

Table FC-8: Heaping on age at death
A common problem in the collection of data on age at death is heaping at 12 months of age, i.e. a large number of deaths are reported at 12 months relative to the number reported at months 9, 10 and 11, or at months 13, 14 and 15. Such heaping can result in underestimation of the infant mortality rate (based on deaths in months 0-11) and overestimation of the child mortality rate (based on deaths in months 12-23 and years 2-4).

Heaping of deaths at 12 months of age is the result of two frequently encountered interviewing situations.

The first situation occurs when respondents report age at death as "one year", even though the death may have occurred at 10 months, 16 months, etc. Some interviewers will record "1 year", which is incorrect, or simply convert "1 year" to 12 months and record that without probing, which is also incorrect.

The second situation in which heaping occurs is when a respondent initially reports that she does not know the age but, when encouraged to recall it, reports in terms of a preferred number of months (e.g. 12 rather than 11 or 13).
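Table FC-8 measures this heaping with a 12 months ratio, defined in its footnote as (deaths at 12 months + deaths reported as "1 year") / ((all deaths at 8-16 months + deaths reported as "1 year") / 9). The sketch below implements that footnote formula; the function and argument names are illustrative assumptions, and the counts are the Team 3 row of Table FC-8.

```python
def twelve_month_ratio(deaths_by_month, reported_one_year):
    """12 months ratio as defined in the footnote to Table FC-8.

    deaths_by_month maps age at death in months (8 to 16) to the number of
    deaths; reported_one_year is the number of deaths recorded as "1 year".
    """
    heaped = deaths_by_month[12] + reported_one_year
    total = sum(deaths_by_month[month] for month in range(8, 17)) + reported_one_year
    if total == 0:
        return None  # ratio undefined when no deaths fall in the window
    # With no heaping, each of the 9 months would hold about total / 9 deaths
    return heaped / (total / 9)


# Team 3 row of Table FC-8
team3 = {8: 8, 9: 12, 10: 6, 11: 3, 12: 7, 13: 0, 14: 0, 15: 2, 16: 4}
team3_ratio = twelve_month_ratio(team3, reported_one_year=11)
print(round(team3_ratio, 1))
```

A ratio near 1 means deaths at 12 months are about as frequent as at the neighbouring months; Team 3 records roughly three times as many as an even spread would give.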

Table FC-8 Age at death heaping
Number of deaths in the 15 years preceding the survey occurring at 8-16 months of age by reported months of age at death (including age at death reported as "one year"), and 12 months ratio, according to interviewer team. (Includes deaths for which a calendar period of death could not be assigned because of missing date of birth information. Deaths lacking age at death are not included. Based on births of all women.)

Age at death (in months); "1 year" = deaths at 12 months reported as 1 year; Total = 8-16 months (including "1 year"); Ratio = 12 months ratio (including "1 year")

Team         8 m.   9 m.  10 m.  11 m.  12 m.  1 year  13 m.  14 m.  15 m.  16 m.  Total  Ratio*  Target not met
Team 1         10      4      4      4      3       4      3      2      2      1     37     1.7             1.7
Team 2         20     12      5      2      4       5      2      1      3      2     56     1.4               -
Team 3          8     12      6      3      7      11      0      0      2      4     53     3.1             3.1
Team 4         10      8      4      3      2       5      3      2      4      3     44     1.4               -
Total          48     36     19     12     16      25      8      5     11     10    190     1.9             1.9

* 12 months ratio = (deaths at 12 months + deaths reported at "1 year") / ((all deaths 8-16 m. + deaths reported at "1 year") / 9)

Table FC-9: Underreporting of infant deaths

Underreporting of births and of deceased children seriously undermines data quality. This table is useful in determining whether gross underreporting of infant deaths is occurring. However, there is no certain way to determine whether an individual interviewer or team is omitting births of deceased children, because sampling fluctuations and genuine regional differences can produce differences among teams and individuals that are unrelated to data quality.

Generally, if the ratio of neonatal to infant deaths falls below 0.45, or is significantly lower in one or more teams than in the others, omission of neonatal deaths is suspected. Likewise, if the ratio of infant deaths to total births is substantially lower in one or more teams than in the others, omission of infant deaths is suspected.
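The two ratios can be sketched as follows, using the definitions in the header of Table FC-9: neonatal to infant deaths = (1) / (1 + 2), and infant deaths per thousand births = (1 + 2) / (7). The sketch assumes that column (1) holds neonatal deaths (under 1 month) and column (2) deaths at 1-11 months; the function name and all counts below are hypothetical, chosen only to illustrate the 0.45 rule.

```python
NEONATAL_SHARE_THRESHOLD = 0.45  # rule of thumb stated in the text


def infant_mortality_ratios(neonatal_deaths, postneonatal_deaths, total_births):
    """Return (neonatal share of infant deaths, infant deaths per 1,000 births)."""
    infant_deaths = neonatal_deaths + postneonatal_deaths
    neonatal_share = neonatal_deaths / infant_deaths if infant_deaths else None
    per_thousand = 1000.0 * infant_deaths / total_births if total_births else None
    return neonatal_share, per_thousand


# Hypothetical counts: (neonatal deaths, deaths at 1-11 months, total births)
hypothetical_teams = {
    "Team A": (30, 20, 1000),
    "Team B": (12, 28, 1000),
}

suspect = []
for team, counts in hypothetical_teams.items():
    share, per_thousand = infant_mortality_ratios(*counts)
    if share is not None and share < NEONATAL_SHARE_THRESHOLD:
        suspect.append(team)

print(suspect)
```

As the text notes, a flag of this kind is only a prompt for closer review, since sampling fluctuations and genuine regional differences can also lower a team's ratio.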

Table FC-9 Infant mortality
Number of births in the 15 years before the survey by survival status and age at death (for those who died), the ratio of neonatal deaths (< 1 mo.) to all infant deaths (< 12 mos.), and the ratio of infant deaths to all births, according to interviewer team.

Columns: Team; All births: Age at death in months for children who died*, Still alive (6), Total births (7) = (5 + 6); Ratios: Ratio of neonatal to infant deaths, (1) / (1 + 2), Ratio of infant deaths to total births per thousand, (1 + 2) / (7)