Comments on WP # 3. Discussant: Ian McDowell, University of Ottawa, Canada. Working Paper No. 13, 21 November 2005. STATISTICAL COMMISSION and UN ECONOMIC COMMISSION FOR EUROPE, CONFERENCE OF EUROPEAN STATISTICIANS; STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT); WORLD HEALTH ORGANIZATION (WHO). Joint UNECE/WHO/Eurostat Meeting on the Measurement of Health Status (Budapest, Hungary, 14-16 November 2005). Session 3


Page 1

Comments on WP # 3.

Discussant: Ian McDowell,

University of Ottawa,

Canada

Working Paper No. 13

21 November 2005

STATISTICAL COMMISSION and UN ECONOMIC COMMISSION FOR EUROPE
CONFERENCE OF EUROPEAN STATISTICIANS

STATISTICAL OFFICE OF THE EUROPEAN COMMUNITIES (EUROSTAT)

WORLD HEALTH ORGANIZATION (WHO)

Joint UNECE/WHO/Eurostat Meeting on the Measurement of Health Status (Budapest, Hungary, 14-16 November 2005)

 Session 3

Page 2

Clarify Purpose: Description or evaluation? Design implications of each…

Descriptive:

• Broad ranging. Goal = to classify groups

• Themes of interest to people in general (“quality of life”, etc); issues of public concern

• To debate: Emphasize modifiable themes?

• To debate: profile rather than index?

Evaluative:

• Content tailored to intervention; usually not comprehensive

• Needs to be sensitive to change produced by particular intervention

• Focused & fine-grained: select indicators that sample densely from relevant level of severity; unidimensional

• ? emphasis on summary score

Discussion point: does proposed instrument need to serve as an evaluative measure?

Page 3

Purpose, Performance and Capacity

[Diagram. Labels: Descriptive purposes; Analytic purposes; Performance; Capacity (with any aids); Capacity (without aids); Potential; Unmet needs; Current picture; Needs that have been met; Environment]

Page 4

Parsimony, Sensitivity & Specificity

These are in tension! Need for brevity implies:

• If the goal is broad coverage of domains (descriptive measure), there can be only a few items in each

• To achieve breadth within a domain in few items, we need to use generic items (e.g., the infamous “can you cut your toenails?”)

• This can achieve sensitivity as a screen, but at cost of low specificity: cannot classify type of condition

• Will also lose interpretability and unidimensionality

• Point #38: the WP discussion of physical function illustrates the choice between measuring overall vs. specific functions. Do we care whether it’s knee pain, or muscle weakness, or balance that limits walking ability?
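The trade-off between sensitivity and specificity for a generic item can be sketched numerically. The following is a minimal Python illustration with made-up data; the function name `screen_metrics`, the ten respondents and the choice of "knee problem" as the target condition are all assumptions for illustration, not taken from the working paper.

```python
def screen_metrics(flagged, has_condition):
    """Return (sensitivity, specificity) of a yes/no screening item.

    flagged[i]       -- True if respondent i reported difficulty on the item
    has_condition[i] -- True if respondent i actually has the target condition
    """
    pairs = list(zip(flagged, has_condition))
    true_pos = sum(f and c for f, c in pairs)
    false_neg = sum((not f) and c for f, c in pairs)
    true_neg = sum((not f) and (not c) for f, c in pairs)
    false_pos = sum(f and (not c) for f, c in pairs)
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


# Hypothetical sample of 10 respondents answering a generic walking item.
# Six report difficulty, but only three of them have a knee problem; the
# other three are limited by something else (weakness, balance, ...).
flagged = [True] * 6 + [False] * 4
has_knee_problem = [True] * 3 + [False] * 7

sensitivity, specificity = screen_metrics(flagged, has_knee_problem)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

In this invented sample the generic item catches every respondent with a knee problem (sensitivity 3/3) but also flags three others, so specificity for that particular condition is only 4/7. That is the pattern described above: a good screen, a poor classifier of the type of condition.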

Page 5

Unidimensionality (point #11)

• The IRT goal of unidimensionality is hard to apply in many areas of health measurement. Some topics are hierarchical; symptoms of depression, for example, are not, so in IRT analyses depression or anxiety scales often do not meet the unidimensionality criterion

• Unidimensionality is chiefly important for clinical interpretation & maybe evaluation; not the issue here. Surveys focus on how bad it is, not what it is

• If instrument will be scored as an index, the issue of unidimensionality becomes irrelevant as all the items are combined and it’s impossible to visualize the person’s disability anyway

• There is an inherent tension between using generic, screening-type items (e.g., IADLs) and unidimensionality

• Many functions involve more than one body system (e.g., recognizing a face across street), so are not unidimensional

Page 6

The Time Frame Debate

• WP 1 says “present”; WP 3 much broader (& varied)

• If sample is large, could use “yesterday” to get prevalence, but will not tell incidence, or duration of condition

• Duration requires additional questions, as does change

• Width of time window not very important: average is just calculated over a shorter or longer time

• Suggest one week (to capture week-ends, etc) or else “yesterday” (as today is incomplete)

Problem! Change is only captured if additional questions are asked, so can’t distinguish A from B

[Diagram. Labels: Sampling window; A, B, C]
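The point that a sampling window alone cannot reveal change can be illustrated with a small Python sketch. The two daily histories below are hypothetical (the slide's own diagram is not recoverable from the transcript), and `window_report` is an illustrative helper, not part of any proposed instrument.

```python
# Hypothetical daily status over 14 days: 1 = limited, 0 = not limited.
# These shapes are assumptions for illustration only.
person_a = [1] * 14              # long-standing limitation
person_b = [0] * 10 + [1] * 4    # limitation began four days ago


def window_report(history, window_days):
    """Proportion of the most recent `window_days` days spent limited."""
    recent = history[-window_days:]
    return sum(recent) / len(recent)


for days in (1, 3):
    print(days, window_report(person_a, days), window_report(person_b, days))
```

With a "yesterday" window or a three-day window, both hypothetical respondents give the same answer, although one has a stable condition and the other a new one. Telling them apart needs a separate question about onset, duration or change.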

Page 7

Time Window & Response Shift

• (Point #13) Larger time windows, and phrasing in terms of “usual” can face issue of response shift (recalibration of person’s view of what is “normal”)

• “Usual” phrasing seems most problematic: may miss chronic disabilities (cf. criticism of GHQ); cannot record incidence, maybe not even prevalence

Response Shift: [Diagram. Labels: Actual trajectory; Perception of “usual” function; Typical delay varies according to a range of factors]
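One way to picture response shift is as a perception of "usual" function that lags behind the actual trajectory and gradually catches up. The Python sketch below uses simple exponential smoothing for that lag. This model, the `adaptation_rate` parameter and the numbers are assumptions chosen for illustration; the slides do not specify any particular model.

```python
def perceived_usual(actual, adaptation_rate):
    """Model the sense of 'usual' function as a lagged version of `actual`.

    adaptation_rate lies in (0, 1]; higher values mean faster recalibration.
    Exponential smoothing is an assumed model, not taken from the slides.
    """
    usual = [actual[0]]
    for level in actual[1:]:
        usual.append(usual[-1] + adaptation_rate * (level - usual[-1]))
    return usual


# Hypothetical trajectory: function drops from 100 to 60 and stays there.
actual = [100, 100, 100, 60, 60, 60, 60, 60]
usual = perceived_usual(actual, adaptation_rate=0.5)

for step, (a, u) in enumerate(zip(actual, usual)):
    print(step, a, round(u, 2))
```

Under these assumptions the perceived "usual" level converges on the new, lower level. A respondent asked to compare current function with "usual" would then report little difference, which is how "usual" phrasing could miss a chronic disability.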

Page 8

Continuous States vs. Episodic Events

• Mobility limitations often endure. By contrast, pain, anxiety or marital disputes are commonly episodic

• Averaging over a broad time window can be an issue for episodic events (point #15), because averaging episodes raises the issue of frequency vs. intensity of events (see next slide)

• In general, time & averaging is less of an issue for capacity than for performance, because capacity is enduring, performance may fluctuate

• However, the notion of capacity is hard to apply to pain, anxiety and depression (in which wording a question in capacity terms tends to approximate performance)

Page 9

Combining Severity & Frequency (e.g., anxiety questions: point 76; pain, point 97)

• Risk of trying to do too much. The problem of summarizing frequency & severity grows with increasing length of retrospection. If “yesterday” is used, you need only ask about severity

• The term “level” (“How would you describe your level of anxiety?”) is unclear: presumably some combination of severity & frequency of episodes, but how does respondent combine these?

• Options. PhD level: “We want you to judge the overall amount of pain, considering both intensity and frequency, you have experienced …”

Simpler: “How bad was your pain?” Mild, moderate, severe…

[Diagram: comparison of patterns over time (“versus ?”)]
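The difficulty of folding frequency and severity into a single "level" can be shown with two invented pain diaries. This Python sketch is illustrative only: the 0 to 10 scoring, the seven-day period and the `summarize` helper are assumptions, not items from the working paper.

```python
# Hypothetical pain diaries over 7 days, scored 0 (none) to 10 (worst).
frequent_mild = [2, 2, 2, 2, 2, 2, 2]
rare_severe = [0, 0, 0, 0, 0, 7, 7]


def summarize(diary):
    """Return the mean score, the number of days with any pain, and the peak."""
    pain_days = sum(1 for score in diary if score > 0)
    return {
        "mean": sum(diary) / len(diary),
        "pain_days": pain_days,
        "peak": max(diary),
    }


print(summarize(frequent_mild))
print(summarize(rare_severe))
```

Both diaries average 2.0 over the week, yet one describes constant mild pain and the other two days of severe pain. A single retrospective "level" cannot separate them, whereas a "yesterday" question only needs severity.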

Page 10

Response options: Frequency vs. Difficulty (point # 30)

• For chronic conditions, evidently intensity responses are more appropriate

• For fluctuating conditions (insomnia, depression), frequency seems most appropriate

• If brief recall periods, use intensity responses

• For longer-term recall, use frequency

• Also, need to decide on relative vs. absolute responses. E.g., “do you have difficulty keeping up with people your own age?”

• Likewise, do we specify “level ground” for walking, or “where you live”? The first is close to disability and may not be relevant to the respondent; the second (handicap) will be relevant but may make direct comparisons difficult

Page 11

Discuss Structure of Overall Instrument

• Can it be made dynamic? Item banking; tailored responses; computer administration or using skip patterns. Some examples:

• Cella: http://outcomes.cancer.gov/conference/irt/cella_et_al.pdf

• www.amIhealthy.com

• Ware JE et al. Item banking and the improvement of health status measures. Quality of Life Newsletter 2004; Fall (Special Issue):2-5.

• Bjorner JB et al. Using item response theory to calibrate the Headache Impact Test (HIT) to the metric of traditional headache scales. Qual Life Res 2003; 12:981-1002
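As a rough sketch of what a "dynamic" instrument built on an item bank might do, the Python below picks the next question under a Rasch (one-parameter IRT) model by choosing the unasked item that is most informative at the respondent's current estimated level. The item labels, their difficulties and the function names are hypothetical; this is one common adaptive-testing approach, not a description of the systems cited above.

```python
import math


def rasch_probability(theta, difficulty):
    """Probability of endorsing an item under the Rasch (1PL) model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def next_item(theta, bank, asked):
    """Pick the unasked item with the greatest Fisher information at theta.

    Under the Rasch model the information is p * (1 - p), which peaks
    when the item difficulty equals theta.
    """
    def information(name):
        p = rasch_probability(theta, bank[name])
        return p * (1.0 - p)

    candidates = [name for name in bank if name not in asked]
    return max(candidates, key=information)


# Hypothetical item bank: item label -> difficulty in logits.
bank = {"walk_100m": -1.5, "climb_stairs": 0.0, "run_1km": 2.0}

first = next_item(0.2, bank, asked=set())
second = next_item(0.2, bank, asked={first})
print(first, second)
```

In a full adaptive administration the estimate `theta` would be updated after each answer, so that each respondent sees only the few items near his or her own level. A paper skip pattern is a cruder version of the same idea.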

Page 12

Reference for upper level of function

• Best possible function

• Compared to your potential

• Compared to average person of your age

• Without difficulty

• To adjust for age or not?

Page 13

Prosthetics, Analgesics, etc. (points 20-25)

Rocks & hard places…

• Without aids approximates impairment; with aids = disability

• But this distinction is hard to make in ICF: ‘activity’ and ‘participation’ both sound like performance rather than capacity

• Not quite clear why eye glasses are singled out for inclusion, while walking sticks apparently are not

• Asking an amputee about mobility without his prosthesis seems artificial (point #21)

• Likewise, if respondents are taking effective analgesics, it’s hard for them to report what their pain would be without them (points #24 & 25)

• If purpose is to indicate health states in this nation, suggest the approach of “using any aids you normally use.”

• Suggest not relying on use of analgesics as way to indicate severity (point #22), because availability will vary greatly

Page 14

Visual Analogue Scales

• In clinical settings, VAS and numerical rating scale (NRS) pain ratings intercorrelate highly. Verbal scales correlate with both, but less closely

• VAS is visual, so implies use of paper & pencil

• If used in telephone format, VAS reduces to an NRS, so why not just use NRS?

• Less educated and older patients appear to find NRS easier than VAS, so these have been endorsed for use in cancer trials (Moinpour et al., J Natl Cancer Inst 1989; 81:485-495)

• The FLIC began with VAS, but changed to 6-pt NRS

• However, the VAS can be very responsive (e.g., Hagen et al, J Rheumatol 1999; 26:1474-1480). But do we need responsiveness?

• Many alternative formats, including graphic rating scale (Dalton et al, Cancer Nurs 1998; 21:46-49) or box scale (Jensen et al, Clin J Pain 1998; 14:343-349). See also Cella & Perry, Psychol Rep 1986; 59:827-833, and Scott & Huskisson, Pain 1976; 2:175-184.

Page 15

Anxiety & Depression

• Trying to discriminate between these may focus attention on the trees rather than the forest

• Unitary theory sees A & D as expressions of the same pathology; the opposing perspective sees them as fundamentally different, while the compromise is to view them as having common roots but different expressions (Brown et al, J Abnorm Psychol 1998; 107:179-192).

• Anxiety suggests arousal and an attempt to cope with a situation; depression suggests lack of arousal and withdrawal: the NE and SE quadrants of the diagram (next slide)

• An anxious person might say “That terrible event is not my fault but it may happen again, and I may not be able to cope with it but I’ve got to be ready to try.” A depressed person might say “That terrible event may happen again and I won’t be able to cope with it, and it’s probably my fault anyway so there’s really nothing I can do.” (Barlow DH. The nature of anxiety: anxiety, depression, and emotional disorders. In: Rapee RM, Barlow DH, eds. Chronic anxiety: generalized anxiety disorder and mixed anxiety-depression. New York: Guilford, 1991: 1-28)

Page 16

A circumplex model of affect

[Diagram. Axis poles: High positive affect / Low positive affect; High negative affect / Low negative affect; Pleasantness / Unpleasantness; Strong engagement / Disengagement. Adjective clusters shown around the circle: content, happy, satisfied; active, elated, excited; aroused, astonished, concerned; relaxed, calm, placid; distressed, fearful, hostile; sad, lonely, withdrawn; sluggish, dull, drowsy; inactive, still, quiet. “Depression” and “Anxiety” are marked on the diagram.]

Page 17

Emotions & Affect: scattered thoughts

• How to fit affect within the capacity / performance distinction? Many anxiety questions use either state or performance wordings (“How severe was your anxiety?” or “Did anxiety limit your daily activities?”)

• Why try to distinguish anxiety and depression?

• Not completely clear why we need both positive and negative affect (point #68): if time frame correctly chosen, they should not be orthogonal

• Phrase such as “upset or distressed” may capture general affect quite well

• Stress may also be pertinent: cf. DASS of Lovibond (Manual for the Depression Anxiety Stress Scales. Sydney: Psychology Foundation, 1995)