Measurement and Health Information Systems
Working Paper No.8
11 November 2005
STATISTICAL COMMISSION and UN ECONOMIC COMMISSION FOR EUROPE
CONFERENCE OF EUROPEAN STATISTICIANS
STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)
WORLD HEALTH ORGANIZATION (WHO)
Joint UNECE/WHO/Eurostat Meeting on the Measurement of Health Status (Budapest, Hungary, 14-16 November 2005)
Session 4 – Invited paper
Issues in international comparisons of health status measurement
Somnath Chatterji
WHO
Instrument Development
• review of existing standard instruments
• international consultations
• review by key informants
• pilot testing
• psychometric properties – reliability and validity
• culturally sensitive ways of ensuring comparability across populations
– within and across countries
WHO Experience
• Extensive cognitive testing in cognitive labs with cognitive debriefing protocols
• Formal testing of recall periods
• Comparators used in self-reports
• Response scales
WHO Experience
• Translation protocols
– Creation of a translation workbench
– Translation by bilingual expert group
– Targeted back-translation
– Linguistic evaluation with formal feedback on problem terms / concepts
– External review by bilingual experts
WHO Experience
• Formal psychometric testing
– Test-retest reliability to establish replicability
– Analysis of psychometric properties across populations
– Verification of item-level missing data by population
– Check of IRT properties of items
– Determination of the most parsimonious set for measurement
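As an illustration of the test-retest check listed above, the sketch below computes Cohen's kappa (chance-corrected agreement) between two administrations of one categorical item. The slides do not say which reliability statistic was used, so kappa is an assumption here, and the response data and the function name `cohen_kappa` are hypothetical.

```python
from collections import Counter


def cohen_kappa(test, retest):
    """Chance-corrected agreement between two administrations of one item."""
    if len(test) != len(retest) or not test:
        raise ValueError("need two non-empty response lists of equal length")
    n = len(test)
    # Share of respondents giving the same category both times.
    observed = sum(a == b for a, b in zip(test, retest)) / n
    # Agreement expected by chance from the marginal category frequencies.
    test_counts, retest_counts = Counter(test), Counter(retest)
    expected = sum(test_counts[c] * retest_counts[c] for c in test_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both rounds used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical responses on a 5-point scale (1 = lowest, 5 = highest category).
first_round = [1, 2, 2, 3, 5, 4, 1, 2, 3, 3]
second_round = [1, 2, 3, 3, 5, 4, 1, 2, 2, 3]
kappa = cohen_kappa(first_round, second_round)
print(f"kappa = {kappa:.3f}")
```

Kappa is only one possible choice; for scales treated as continuous, a correlation or intraclass coefficient would be the more usual summary.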
WHO Experience
• Cross-Population Comparability
– Problem of cross-population comparability: self-report categorical response data are not necessarily cross-population comparable.
– The problem can be conceptualised in terms of shifts in response category cut-points. For the same level of ability on an underlying domain, respondents from different socio-demographic backgrounds respond in different response categories.
– This would imply, for example, that someone saying “no problems” with respect to the domain of mobility in country A may not mean the same thing as someone saying “no problems” in country B in terms of the underlying latent variable of mobility: norms or expectations may be lower in country A.
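The cut-point shift described above can be made concrete with a small sketch. Two hypothetical respondents share the same latent mobility difficulty but apply different thresholds, so they land in different response categories. The latent scale, the cut-point values and the category labels are all assumptions chosen for illustration.

```python
from bisect import bisect_right

# Hypothetical ordered response labels, from least to most difficulty.
LABELS = ["no problems", "mild", "moderate", "severe", "extreme"]


def response_category(latent_difficulty, cut_points):
    """Map a latent value to an ordered category using ascending cut-points."""
    return LABELS[bisect_right(cut_points, latent_difficulty)]


# Hypothetical cut-points on an arbitrary latent "mobility difficulty" scale.
# Country A has lower expectations, so more difficulty is tolerated before
# a respondent moves out of the "no problems" category.
cuts_country_a = [1.0, 2.0, 3.0, 4.0]
cuts_country_b = [0.0, 1.0, 2.0, 3.0]

same_latent_level = 0.5
answer_a = response_category(same_latent_level, cuts_country_a)
answer_b = response_category(same_latent_level, cuts_country_b)
print(answer_a, "|", answer_b)
```

Comparing the raw category shares of the two countries would therefore mix true differences in mobility with differences in how the scale is used.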
WHO Experience
• Enhancing Cross-Population Comparability
– Two strategies: use the hierarchical ordered probit (HOPIT) model with (a) vignettes, and (b) measured tests.
– A vignette is a hypothetical description of a level of ability on a given domain, which respondents are asked to evaluate with respect to the same question, and on the same categorical response scale, as the main self-report for each of the domains in health and in responsiveness.
– Vignettes allow us to fix the level of ability, so that any variation in responses is attributable to shifts in response category cut-points.
– Using the HOPIT model, these variations in estimated cut-points across socio-demographic groups can then be used to calibrate self-report responses to make them cross-population comparable.
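A minimal sketch of why vignettes identify cut-points, assuming the usual ordered-probit structure: the same respondent-specific cut-points enter both the vignette ratings (whose latent levels are fixed across respondents) and the self-report. This is not the estimation code behind the slides. The names `category_probs` and `respondent_loglik` and all numbers are hypothetical, and the full model's dependence of cut-points on covariates and its maximum-likelihood fitting are omitted.

```python
from math import log
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def category_probs(location, cut_points):
    """Ordered-probit probabilities of each response category.

    `location` is the latent level being rated; `cut_points` are the
    respondent's ascending thresholds between adjacent categories.
    """
    cdf = [0.0] + [PHI(c - location) for c in cut_points] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


def respondent_loglik(self_report, vignette_ratings, mu, thetas, cut_points):
    """Log-likelihood contribution of one respondent (categories are 0-based).

    The vignette levels `thetas` are common to everyone, so vignette ratings
    carry information about `cut_points` alone; the self-report then informs
    the respondent's own latent level `mu` given those cut-points.
    """
    ll = log(category_probs(mu, cut_points)[self_report])
    for rating, theta in zip(vignette_ratings, thetas):
        ll += log(category_probs(theta, cut_points)[rating])
    return ll


# Hypothetical values for illustration only.
thetas = [-1.0, 0.5, 2.0]        # fixed latent severities of three vignettes
cuts_a = [-0.5, 0.5, 1.5, 2.5]   # candidate cut-points, group A
cuts_b = [-1.5, -0.5, 0.5, 1.5]  # candidate cut-points shifted one unit lower
ratings = [0, 1, 3]              # one respondent's ratings of the vignettes

ll_a = respondent_loglik(1, ratings, 0.0, thetas, cuts_a)
ll_b = respondent_loglik(1, ratings, 0.0, thetas, cuts_b)
print(f"log-likelihood under cuts_a: {ll_a:.3f}, under cuts_b: {ll_b:.3f}")
```

Here the ratings are better explained by `cuts_a` than by `cuts_b`. A full estimation would search over cut-point and latent-level parameters for all respondents jointly rather than compare two fixed candidates.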
WHO Experience
• Problems with self-reported morbidity / diagnosis
– Variations across countries lack face validity
– Probably more reflective of service availability and knowledge
Comparisons across surveys
[Figure: three panels plotting average disability by age category (agecat: 15-29, 30-44, 45-59, 60-69, 70-79, 80+), vertical axis from 0 to .5, one line per country (AUT, BEL, DEU, DNK, ESP, FIN, FRA, GBR, GRC, IRL, ITA, LUX, NLD, PRT, SWE).
Panel 1: Avg. Disability for Males, Unadjusted Survey Data (Probit Prev)
Panel 2: Avg. Disability for Males, Vignette Adjusted Survey Data (Chopit Prev)
Panel 3: Avg. Disability for Females, Unadjusted Survey Data (Probit Prev)]
Survey comparisons – post adjustment
Prevalence of asthma: Self-reported diagnosis
[Figure: two panels, MEX and ESP, showing self-reported diagnosis by age group (18-25 to 75+) for males and females; horizontal axis from 0.00 to 0.15.]
Prevalence of asthma: Probabilistic diagnosis scale
[Figure: two panels, MEX and ESP, showing the mean probabilistic diagnosis scale by age group (18-25 to 75+) for males and females; horizontal axis from 0.00 to 0.15.]