2016 aapor michael wild

Item response time in household surveys in developing countries: A multinational, multiregional and multicultural perspective. Presentation for AAPOR 2016, Austin, Texas. Michael Wild, Michael Lokshin


Page 1: 2016 aapor michael wild

Item response time in household surveys in developing countries: A multinational, multiregional and multicultural perspective.

Presentation for AAPOR 2016, Austin, Texas

Michael Wild, Michael Lokshin

Page 2: 2016 aapor michael wild

Paradata for quality control

• Paradata can act as a means for quality control.

• So far, paradata have been hardly accessible for large-scale face-to-face surveys in developing countries.

• However, developing countries depend much more strongly on survey data than the developed world, as such data are very often the only source of information about the socio-economic conditions of their population.

• Additional challenges in this context: a low-skill environment in terms of respondents and survey staff, difficult logistics, and strong cultural components.

Page 3: 2016 aapor michael wild

Paradata across countries

• The World Bank runs several standardized surveys across different countries and regions, e.g. the Living Standards Measurement Study (LSMS).

• Other international organizations run similar programs, e.g. UNICEF with the Multiple Indicator Cluster Surveys (MICS) or USAID with the Demographic and Health Surveys (DHS).

• Questionnaires have standardized items (which allow for local modifications), a certain standardization in sample designs, and partially standardized tabulation plans.

• Most of these surveys are carried out with National Statistical Agencies with strongly heterogeneous skill levels, and thus with strong variation in the data actually collected.

Page 4: 2016 aapor michael wild

The World Bank’s Survey Solutions CAPI package and the collection of paradata

• Survey Solutions is a CAPI package developed by the DEC-SM department.

• It consists of 3 components:
  • The questionnaire designer
  • The survey management console
  • The data collection application running on Android OS

• Survey Solutions collects the following paradata about the survey process:
  • Time per question
  • Number of answer changes
  • Number of activated error messages
  • Use of additional information fields provided for specific questions
  • Type of question

• In addition, whenever we provide technical assistance, we now also collect interviewer characteristics, which we use in the analysis of the survey process.
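The first two paradata items listed above (time per question and number of answer changes) can be derived from a timestamped event log. The following is a minimal sketch only: the record layout `(interview_id, question, timestamp)`, the `"start"` marker, and the rule that time per question is the gap since the previous event in the same interview are illustrative assumptions, not the documented Survey Solutions export format.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (interview_id, question, ISO timestamp of the answer event).
events = [
    ("int-1", "start", "2016-03-01T09:00:00"),
    ("int-1", "q1", "2016-03-01T09:00:12"),
    ("int-1", "q2", "2016-03-01T09:00:40"),
    ("int-1", "q1", "2016-03-01T09:01:00"),  # the answer to q1 is changed later
    ("int-2", "start", "2016-03-01T10:00:00"),
    ("int-2", "q1", "2016-03-01T10:00:05"),
]


def summarize(events):
    """Return seconds spent and number of answer events per (interview, question)."""
    seconds = defaultdict(float)
    answers = defaultdict(int)
    last_seen = {}  # timestamp of the previous event in each interview
    for interview, question, stamp in sorted(events, key=lambda e: (e[0], e[2])):
        now = datetime.fromisoformat(stamp)
        if interview in last_seen and question != "start":
            # Assumption: time on a question is the gap since the previous event.
            seconds[(interview, question)] += (now - last_seen[interview]).total_seconds()
            answers[(interview, question)] += 1
        last_seen[interview] = now
    return dict(seconds), dict(answers)


seconds, answers = summarize(events)
# An answer change is any answer event beyond the first one for that question.
changes = {key: count - 1 for key, count in answers.items()}
```

In this toy log, `q1` in interview `int-1` accumulates time from two separate visits and counts one answer change.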

9/7/2016

Page 5: 2016 aapor michael wild

Paradata Attributes

• Paradata are in general difficult to analyse because they:
  • are unstructured
  • come in large volume
  • can be considered Big Data

• Besides this, there is no clear definition of which type of data to collect, how it should be measured, etc.

• As a result, different survey systems create different types of data.

Page 6: 2016 aapor michael wild

Paradata harmonization

• The World Bank operates in almost all developing and transition countries, and conducts many surveys through Technical Assistance programs for National Statistical Agencies.

• Survey Solutions has proven to work well in challenging low-skill environments and is freely available.

• A side effect of these two aspects combined is the standardization of paradata through the use of the Survey Solutions CAPI system.

• With these standardized items at hand, we can use paradata in combination with the already harmonized questionnaire items to define measures of quality (control), as well as standards for sample representativeness or efficient survey operations.

Page 7: 2016 aapor michael wild

Paradata from 3 different surveys in 3 different countries

ZAMBIA: National Disability Survey

Sample size 9126

Interviewers 48

Paradata 2180316

UGANDA: Household and plot measurement survey

Sample size 920

Interviewers 18

Paradata 64781

BENIN: Midline household establishment survey (impact evaluation)

Sample size 1810

Interviewers 34

Paradata 466429
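To put the three surveys on a comparable footing, the figures above can be turned into simple ratios. This is a rough sketch that assumes the "Paradata" row is a count of paradata records and that "Sample size" counts sampled units (one interview each); neither reading is stated on the slide.

```python
# Figures copied from the slide. Assumption: "paradata" is a record count
# and "sample" is the number of sampled units (one interview each).
surveys = {
    "Zambia": {"sample": 9126, "interviewers": 48, "paradata": 2180316},
    "Uganda": {"sample": 920, "interviewers": 18, "paradata": 64781},
    "Benin": {"sample": 1810, "interviewers": 34, "paradata": 466429},
}


def per_unit(surveys):
    """Compute paradata records per sampled unit and units per interviewer."""
    ratios = {}
    for country, s in surveys.items():
        ratios[country] = {
            "records_per_unit": s["paradata"] / s["sample"],
            "units_per_interviewer": s["sample"] / s["interviewers"],
        }
    return ratios


ratios = per_unit(surveys)
for country, r in ratios.items():
    print(f"{country}: {r['records_per_unit']:.1f} records per unit, "
          f"{r['units_per_interviewer']:.1f} units per interviewer")
```

Under these assumptions the volume of paradata per sampled unit is roughly 239 records in Zambia, 70 in Uganda and 258 in Benin, which hints at how much questionnaire length and complexity differ between the three surveys.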

Page 8: 2016 aapor michael wild

Response time distributions (ZAMBIA)

Page 9: 2016 aapor michael wild

Response time distribution (UGANDA)

Page 10: 2016 aapor michael wild

Response time distribution (BENIN)

Page 11: 2016 aapor michael wild

Models used for the estimation

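• A standard regression model

• $y_i = \alpha_0 + \beta_i x_i + \varepsilon_i$

• A two-stage random intercept model

• $y_{ij} = \alpha_0 + \beta_i x_{ij} + u_j + \varepsilon_{ij}$

• A three-stage random intercept model

• $y_{ijk} = \alpha_0 + \beta_i x_{ijk} + u_j + v_k + \varepsilon_{ijk}$, with $u_j \sim N(0, \sigma_u^2)$ and $v_k \sim N(0, \sigma_k^2)$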

• All models have also been extended with a random slope coefficient.

• Model fit was assessed with a standard likelihood-ratio (LR) test, comparing the linear model against the random intercept model, and the random intercept model against the random slope model.
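The LR comparison of nested models described above can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' procedure: the two models are assumed to differ by exactly one parameter (for example a single random intercept variance), the plain chi-square reference distribution with one degree of freedom is used, and the log-likelihood values are hypothetical.

```python
import math


def lr_test_df1(loglik_restricted, loglik_full):
    """Likelihood-ratio test for nested models that differ by one parameter.

    Returns the LR statistic and its p-value under a chi-square(1) reference.
    """
    stat = 2.0 * (loglik_full - loglik_restricted)
    if stat < 0:
        raise ValueError("the fuller model should not have a lower log-likelihood")
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical log-likelihoods: linear model vs. random intercept model.
stat, p_value = lr_test_df1(loglik_restricted=-1050.0, loglik_full=-1040.0)
print(f"LR statistic = {stat:.2f}, p = {p_value:.2g}")
```

Two caveats apply. Adding a random slope usually adds more than one parameter (a slope variance and a covariance), which requires more degrees of freedom. Variance components are also tested on the boundary of the parameter space, so the plain chi-square p-value tends to be conservative.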


Page 12: 2016 aapor michael wild

Results

Variable                  Zambia          Uganda           Benin
word_count             0.0023625 ***  -0.0018296        0.079174 ***
qsteps                 0.0003454 ***   0.0052885 ***    0.000182 ***
numberHHmember        -0.0001542      -0.0093524 ***    0.029101
sex                    0.0385134 ***  -0.0101528        -0.05842
age                    0.0070146 ***   0.0119343 ***    0.003752
educ                  -0.0115035       0.2059003 ***       -0.03 ***
fulltime_interviewer  -0.0155306      -0.2145323 *         -0.05
cai                   -0.0163727      -0.3128348 ***    -0.03936
text                    1.564725 ***    1.135966 ***    0.504315 ***
numeric                0.5433497 ***   0.5687459 ***    0.708689 ***
_cons                   1.307655 ***    1.532505 ***    1.306284 *

ICC                       Zambia          Uganda           Benin
Cluster                     0.03            0.07            0.04
Interviewer                 0.05            0.08            0.03
Respondent                  0.08            0.11            0.06
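The ICC rows above report the share of total variance attributable to a grouping level. The sketch below shows how such shares follow from estimated variance components of a random intercept model; the function name and the variance values are hypothetical and are not taken from the fitted models.

```python
def icc(group_variances, residual_variance):
    """Intraclass correlation per grouping level.

    Each ICC is the variance of that level's random intercept divided by
    the total variance (all random intercepts plus the residual).
    """
    if residual_variance < 0 or any(v < 0 for v in group_variances.values()):
        raise ValueError("variance components must be non-negative")
    total = sum(group_variances.values()) + residual_variance
    return {level: variance / total for level, variance in group_variances.items()}


# Hypothetical variance components for a model with two grouping levels.
shares = icc({"interviewer": 0.2, "cluster": 0.3}, residual_variance=1.5)
for level, share in shares.items():
    print(f"{level}: ICC = {share:.2f}")
```

An ICC of 0.05 at the interviewer level, for example, would mean that about 5% of the variation in response time is attributable to differences between interviewers.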

Page 13: 2016 aapor michael wild

Conclusions and Outlook

• As the preceding analysis of selected harmonized items shows, there are comparable effects across the three surveys.

• Through the harmonization of paradata, surveys can be compared in terms of quality across countries.

• Besides comparability, harmonization also allows for the development of applications which facilitate the analysis of paradata.

• By standardizing questionnaire items in this way, it will be possible to provide guidance on questionnaire administration.