Can secondary analysis teach us about best practice in universal QoL measurement?
Arguments and (some) Evidence
Prof. Gouke J Bonsel MPH MD PhD
Public Health Methods
Obstetrics
Academic Medical Centre - University of Amsterdam
Working Paper No. 10, 21 November 2005
STATISTICAL COMMISSION and UN ECONOMIC COMMISSION FOR EUROPE
CONFERENCE OF EUROPEAN STATISTICIANS
STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)
WORLD HEALTH ORGANIZATION (WHO)
Joint UNECE/WHO/Eurostat Meeting on the Measurement of Health Status (Budapest, Hungary, 14-16 November 2005)
Session 5 – Invited paper
051116 Budapest 2
Agenda
• Comparative secondary analysis: wanted?
• Goals of measurement
  – Contents
  – Process
• C2A
  – Quantitative - Validity
  – Qualitative - Q/D Vignette
  – Quantitative - Coverage/Refinement
General belief: many issues can be resolved by data

Comparative secondary analysis (C2A)
• >2 crude datasets with
  – known questionnaire + codification rules
  – known population (at least vs. general)
  – sharing > 1 intended concept
  – sufficient common question/response types
  – sufficient language commonalities
• special cases
  – 1 questionnaire, n populations
  – n questionnaires, > 1 populations

Comparative secondary analysis: types
• quantitative, analytical content-driven methods; with and w/o external criterion
• quantitative, descriptive (technical) performance methods
• qualitative, semantics
• qualitative comparison of response form, other operational features
All head-to-head analysis will assume some aspects to be constant over the units to be compared

Goals of QoL measurement: CONTENTS
• Intrinsic goals of health systems
  WHO (+EU?)
  – Health (DALE-like; class): Level, Distribution
  – Responsiveness: Level, Distribution
  – Fairness of financing: Distribution
  Washington
  – Monitoring health population [Health Level]
  – Care provision [Responsiveness+ Level]
  – Equal pursuit [Health+Responsiveness Distribution]
• External goals (GJB)
  – Employment, autonomy, reproduction

Goals of QoL measurement: CONTENTS
• Health state measurement (per domain)
  – multi-item classical test Q (mQ): no
  – ordinal classification (class): yes
  – cf. Item Response Theory calibrated: perhaps
• Suitability for index development
  – in general: perhaps
  – to compose QALY/DALY estimates: yes (but do not tell)
• Projection from mission WHO; to existing instruments and accepted classifications

Goals of QoL measurement: PROCESS
• Efficient Elaboration
• Reliable Elaboration
• Universality of acceptance
• Flexibility of mode of administration
• Low price, low burden
• Fancy appearance

Some remarks (1)
• Domains
  – normal is absence of dys[...]. Avoid the ‘better than normal’ discussion (concept: health is positive; item: happy instead of downhearted). Think of playing music: there is no better than playing on the beat
  – from ALL external criteria, except ease of measurement and peace of mind, follows about equal space for physical versus psychological domains; less (not absent) for social
  – projection of WHO mission, WHO classifications, other instruments: ex post or ex ante
  – take care with the conceptual unidimensionality artefact and the interpretation of empirical correlation as redundancy. Neither classification nor IRT ‘require’ empirical independence

Some remarks (2)
• Domains & Items & Time
  – (pattern over) time is an essential conceptual component; recall technicalities are of minor consideration
  – all concepts are continuous over time, but some state changes appear as events, episodes or chronic states, or can only be defined on a (restricted) activity (= event) basis; hence frequency and intensity are to some extent semantic convention
  – consequences:
    • time can emerge in preamble, item, and response. Uniformity over the questionnaire is essential. People ignore preambles
    • empirical (pattern over) time therefore decides on ‘frequency’ or ‘intensity’, but on average both are relevant
    • experience tells that virtually all domains have day-to-day fluctuations; if unstandardized, response is during best condition
    • graphical tools useful if unidimensional item; so far academic

Some remarks (3)
• Items / Response
  – burden of 3 domains * 6 responses smaller than 6 domains * 3 responses
  – distributional economy ignored; 2 levels is not best, subjective scale experience does not apply; filtering assumes errorless context-free threshold judgment. Shannon’s methodology
  – equalizing semantics across young/old, men/women, rich/poor, nationality or culture standardizes rather than exposes (desired?) differences
  – contextual aspects often ignored; also suitability for translation
  – reliability information (across time, observers, mode of administration) scarce

C2A: Quantitative Head-to-head Validity
• With external criterion
  – domain-specific consequences or etiology and personal characteristics with prespecified relation. Strength of association (preferably RR)
  – examples
    • psychological domain - use of specific care, suicide; preceding life events
    • mobility domain - use of physiotherapy, aids; fracture in preceding period
    • cognitive domain - age
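A minimal sketch of the strength-of-association step, assuming a simple 2x2 layout: domain problem reported (yes/no) against external criterion present (yes/no). The function name and all counts are hypothetical illustrations, not data from the studies cited in these slides.

```python
def relative_risk(events_with_problem, n_with_problem,
                  events_without_problem, n_without_problem):
    """Relative risk (RR) of the external criterion for respondents who
    report a problem in the domain versus those who do not."""
    risk_with = events_with_problem / n_with_problem
    risk_without = events_without_problem / n_without_problem
    return risk_with / risk_without


# Hypothetical counts: physiotherapy use among respondents with and
# without a reported mobility problem, for two instruments A and B.
rr_a = relative_risk(40, 100, 20, 100)
rr_b = relative_risk(30, 100, 25, 100)
print(f"RR instrument A: {rr_a:.2f}")
print(f"RR instrument B: {rr_b:.2f}")
```

In a head-to-head comparison on the same population, the instrument whose domain shows the stronger prespecified association would be preferred on this criterion.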

C2A: Quantitative Head-to-head Validity
• Without external criterion
  – domain relations. Prespecified patterns. Strongly dependent on population (random if about healthy). Comparison difficult if scale type differs (mQ, class, IRT)
  – special case if measure is contained as anchor
  – ex.
    • psychological domains vs. physical domains
    • all domains vs. HUI-Ambulation or EQ-Mobility
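One way to check a prespecified pattern between two ordinal domain scores is a rank correlation. The sketch below uses Spearman's rho as an illustrative choice; the slides do not name a statistic. The function names and the response data are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical responses of 8 people: a 3-level anchor (mobility)
# against a 5-level item from another instrument.
anchor_3l = [1, 1, 1, 2, 2, 2, 3, 3]
other_5l = [1, 1, 2, 2, 3, 4, 4, 5]
print(f"Spearman rho: {spearman(anchor_3l, other_5l):.3f}")
```

As the slide notes, such correlations depend strongly on the population, so values from different datasets are comparable only if the populations are similar.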

C2A: Quantitative Head-to-head Validity
• Without internal cutpoint calibration information
  – Domainwise IRT analysis
• With internal cutpoint calibration information (vignettes)
  – Domainwise CHOPIT-like analysis
Calibration: difficult but essential, ALSO in countries

C2A: Qualitative Head-to-head
• Suitability to compose vignettes (timeless states, annual profiles) to arrive at Q/D values
  – self-reflective domain terms
  – linguistic (non-numerical), objective response mode
  – clear-cut time aspect
  – across domains ‘uniformity’ of terms, categories and time

C2A: Quantitative Head-to-head Efficiency
• Source: investigations supporting increase of levels of EQ5D3L (‘HUI-fication’)
• No methods available to demonstrate benefit of more refinement
• Method: Shannon’s informativity measure = non-parametric (desirable) quantifier. Source: US study http://www.ahrq.gov/rice/ and Med Care 2005;43:203-20 & 221-28
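For reference, the two Shannon quantities used in the following slides can be written as below. This is the standard definition; the logarithm base 2 and the symbols L (number of levels of a dimension), n_i (observed count in level i) and N (number of respondents) are assumptions of this sketch, since the slides do not spell them out.

```latex
p_i = \frac{n_i}{N}, \qquad
H' = -\sum_{i=1}^{L} p_i \log_2 p_i, \qquad
H'_{\max} = \log_2 L, \qquad
J' = \frac{H'}{H'_{\max}}
```

Levels with p_i = 0 contribute nothing to the sum (by the convention 0 log 0 = 0). H' is the absolute informativity; J' expresses it relative to the maximum attainable with L levels.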

C2A EXAMPLE
EQ-5D, HUI2 and HUI3 dimensions with # levels and # unique permutations defined by full descriptive system. Common dimensions are grey
Level descriptions of common domains, EQ-5D, HUI2 & HUI3

Absolute and % distribution of responses EQ-5D, HUI2 & HUI3 (N = 3691)
From the number of potential categories and observed frequencies we can compute Shannon numbers
The more equally distributed, the more info, the better reliability, the better sensitivity

H’ and J’ with skewed and rectangular distributions in 3 level vs. 5 level system
Shannon numbers are cardinal

H’ and J’ with skewed and rectangular distributions in 3 level vs. 5 level system
If system extended but potential categories are not occupied, then
absolute Shannon H same
relative Shannon J lower
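The statement above can be checked numerically. The sketch below assumes the usual definitions H’ = -sum(p * log2(p)) over occupied levels and J’ = H’ / log2(number of levels); the function names and the response counts are hypothetical, not taken from the study.

```python
import math


def shannon_h(counts):
    """Absolute Shannon index H' in bits; empty levels contribute 0."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def shannon_j(counts):
    """Evenness index J' = H' / H'max, with H'max = log2(number of levels)."""
    return shannon_h(counts) / math.log2(len(counts))


# Hypothetical skewed responses on a 3-level dimension.
three_level = [70, 20, 10]
# Same responses after extension to 5 levels, new levels unoccupied.
five_level_unused = [70, 20, 10, 0, 0]
# Rectangular (perfectly even) 3-level distribution for comparison.
rectangular = [10, 10, 10]

for name, counts in [("3L skewed", three_level),
                     ("5L, 2 levels unused", five_level_unused),
                     ("3L rectangular", rectangular)]:
    print(f"{name}: H' = {shannon_h(counts):.3f}, J' = {shannon_j(counts):.3f}")
```

H’ is identical for the first two cases because empty levels add no information, while J’ drops because the maximum attainable H’ has grown from log2(3) to log2(5).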

Shannon’s Absolute Index (H’) and Evenness Index (J’) for the Common Domains of EQ-5D, HUI2 & HUI3.

Conclusions: C2A Efficiency by Shannon
• Head-to-head comparison tools allow choices on information gain by extension or recalibration
• Non-parametric = advantage, as independent from cutpoint (re)estimation
• In healthy or ambulatory diseased populations EQ5D3L equals the HUIs for common domains
• To be combined with differential cutpoint evaluation and reliability!
Straightforwardly applicable for C2A to WHO/EU data if similar population or experimentation

Reliability
• Systematic info to select item/response
  – domain^response * time (retest)
  – domain^response * respondent (interobserver)
  – domain^response * administration (retest)
• EQ5D: 3, 4 or 5
  – experiment on representative panel under controlled conditions comparing 3L - 5L - RS
  – error, ‘filling the space’ and reliability

The task: Classify/Rate ‘Self’ and Disease vignettes
? = Response = 3L, 5L, or horizontal unanchored VAS

Inconsistencies between 3L and 5L responses by dimension, all 15 health vignettes (N = 82)
3L to 5L: no error increase

Inter-observer reliability 3L vs 5L, 15 vignettes
5L much better!

Test-retest reliability for respondents’ own health (3 wk interval) with ICC: 5L best!
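The slides do not state which form of the ICC was used. As a sketch only, the one-way random-effects ICC(1,1) for a test-retest design can be computed as follows; the function name and the scores are hypothetical.

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1,1).

    ratings: list of per-respondent lists, each with the same number k of
    repeated measurements (k = 2 for a simple test-retest design).
    """
    n = len(ratings)
    k = len(ratings[0])
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    # Between-respondent and within-respondent mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for row, m in zip(ratings, row_means)
                    for x in row) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical own-health levels of 5 respondents at test and retest.
perfect = [[1, 1], [2, 2], [3, 3], [4, 4], [5, 5]]
noisy = [[1, 1], [2, 3], [3, 3], [4, 4], [5, 4]]
print(f"ICC, identical retest: {icc_oneway(perfect):.3f}")
print(f"ICC, two changed answers: {icc_oneway(noisy):.3f}")
```

Computing the same statistic per dimension for the 3L and the 5L response format would give the kind of comparison summarized on this slide.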

Average 3Lrs, 5Lrs and RS mean values by dimension, all diseases and self-reported health. 3L and 5L values are transformed (linearly) to the RS scale range (0-100)
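A linear transformation of level numbers to the 0-100 range might look as follows. The orientation (level 1 maps to 0, the highest level to 100) and the function name are assumptions of this sketch; the slides do not specify them.

```python
def level_to_rs(level, n_levels, rs_min=0.0, rs_max=100.0):
    """Map an ordinal level (1..n_levels) linearly onto [rs_min, rs_max].

    Assumes equal spacing between levels and level 1 -> rs_min.
    """
    if n_levels < 2 or not 1 <= level <= n_levels:
        raise ValueError("level must lie in 1..n_levels and n_levels >= 2")
    fraction = (level - 1) / (n_levels - 1)
    return rs_min + fraction * (rs_max - rs_min)


print([level_to_rs(lv, 3) for lv in range(1, 4)])  # 3L levels on RS range
print([level_to_rs(lv, 5) for lv in range(1, 6)])  # 5L levels on RS range
```

Equal spacing is a strong assumption: it treats the level labels as equidistant, which the direct quantification of level terms may not support.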

Indirect and direct quantification of level terms (n = 1230)
Midway = 1/3 rate rule

Shannon’s index (H’) and Shannon’s Evenness index (J’) values for 3L and 5L. Comparison by dimension

Conclusions: C2A Reliability of response terms
• Balance of 3 vs. 5 in favour of 5 (after WHO-choice)
  – error increase low
  – reliability better
  – Shannon rises (much)
• Fairly easy to investigate if great # of respondents
• C2A if multiple response formats for 1 domain

C2A of other process goals
• Universality of acceptance
  – quantitative and qualitative C2A depending on codes for non-response
• Flexibility of mode of administration
  – qualitative comparison only
• Fancy appearance
  – qualitative comparison only
• Low price, low burden
  – quantitatively possible but who cares?

Recommendations
• Comprehensive checklist for C2A
  – starting from structured agreed contents goals and process/technical goals
  – distinguishing between quantitative (incl. Shannon) and qualitative research, and what remains!
  – specify models, techniques and success
• DATA can SOLVE debates
INTERESTING CHOICES remain