
Journal of Health Economics 26 (2007) 992–1002

Multi-attribute utility function or statistical inference models: A comparison of health state valuation models using the HUI2 health state classification system

Katherine Stevens a,∗, Christopher McCabe c, John Brazier a, Jennifer Roberts a,b

a Health Economics and Decision Science (HEDS), University of Sheffield, UK
b Department of Economics, University of Sheffield, UK
c Institute of Health Sciences, University of Leeds, UK

Received 20 March 2006; received in revised form 19 September 2006; accepted 21 December 2006
Available online 11 January 2007

Abstract

A key issue in health state valuation modelling is the choice of functional form. The two most frequently used preference based instruments adopt different approaches; one based on multi-attribute utility theory (MAUT), the other on statistical analysis. There has been no comparison of these alternative approaches in the context of health economics. We report a comparison of these approaches for the health utilities index mark 2. The statistical inference model predicts more accurately than the one based on MAUT. We discuss possible explanations for the differences in performance, the importance of the findings, and implications for future research.

© 2007 Elsevier B.V. All rights reserved.

JEL classification: I19

Keywords: Health-related quality of life; Health utilities index; Valuation functions; Predictive validity

1. Introduction

A key issue in health state valuation modelling is the choice of functional form (Brazier et al., 1999). Two of the most frequently used preference based instruments, the EQ-5D and the health utilities index, adopt different approaches. The health utilities index mark 2 (HUI2) and mark 3 (HUI3) valuation models utilise multi-attribute utility theory (MAUT) to identify the appropriate multi-attribute utility function (MAUF) (Keeney and Raiffa, 1976; Torrance et al., 1996; Feeny et al., 2002), whereas the EQ-5D valuation model is estimated using statistical techniques with the emphasis on empirical rather than theoretical validity (Dolan, 1997). The SF-6D has also adopted this statistical approach (referred to as statistical inference models hereafter) (Brazier et al., 2002), whereas the assessment of quality of life (AQoL) mark 2 valuation is based on a MAUT approach (Richardson et al., 2004). As far as we are aware, there has been no direct comparison of these alternative approaches in the health economics literature (Brazier et al., 1999). This paper reports the results of applying these two different approaches to the UK valuation of the HUI2 and compares their performance in an external validation dataset.

∗ Corresponding author at: Health Economics and Decision Science, School of Health and Related Research, University of Sheffield, Regent Court, 30 Regent Street, Sheffield S1 4DA, UK. Tel.: +44 114 222 0841; fax: +44 114 272 4095.
E-mail address: K.Stevens@Sheffield.ac.uk (K. Stevens).
0167-6296/$ – see front matter © 2007 Elsevier B.V. All rights reserved.
doi:10.1016/j.jhealeco.2006.12.007

2. Background

The MAUF form has its foundation in multi-attribute utility theory, which is in turn consistent (given certain assumptions) with Expected Utility Theory (EUT) (Keeney and Raiffa, 1976). As long as at least one of these additional assumptions holds, MAUT will allow the identification of the utility associated with each of the possible outcomes in a classification system.

In addition to the theoretical foundations, a strong appeal of the MAUT approach is that it specifies the states for which utilities must be measured to parameterise the utility function (Torrance et al., 1996).

Whilst there is some evidence that stated preferences are consistent with subsequent choices and behaviour (Feeny, 2005) there is also an extensive literature demonstrating that observed choice behaviour is not consistent with EUT (Schoemaker, 1982). In this context, the value of consistency with theory is unclear and researchers are left without a strong rationale for choosing between the MAUF and statistical approach. Instead, the ability of the models to predict values and choices becomes the primary method for identifying the preferred utility function (Dolan, 1997; Brazier et al., 2002).

The statistical inference (SI) approach is not without its problems; there is no theoretical basis for the selection of functional form to be estimated. The absence of a theory describing the relationship between the attributes within a multi-attribute system means there is little guidance for identifying the health states required to estimate the utility function. A further limitation of the SI approach is that there is no basis on which to identify the minimum number of valuations per health state required. UK valuation studies of the EQ-5D, SF-6D and HUI2 have adopted very different approaches to each of these issues (Dolan, 1997; Brazier et al., 2002; McCabe et al., 2005a).

It has been suggested that as the size of the descriptive system increases, the MAUF approach becomes more appealing as it keeps the number of states to be valued small in comparison to the SI approach (Dolan, 2002). However, this is a purely operational consideration and if the resulting model is a poor representation of preferences, convenience is an inadequate justification.

In the only direct comparison of the MAUT and statistical approaches we identified, Currim and Sarin investigated the job choice of 100 MBA graduate students in the United States using a self-administered questionnaire (Currim and Sarin, 1984). They reported that the SI model outperformed algebraic models (conjoint and MAUF) in predicting the observed job choices.


3. Objective

The aim of the study was to compare the performance of a multi-attribute utility function and SI valuation model for the health utilities index mark 2 (HUI2) health state classification system at predicting health state values (McCabe et al., 2005a,b).

4. Methods

The HUI2 is the only preference based multi-attribute health related quality of life instrument specifically developed for use with children (Torrance et al., 1996). It consists of seven attributes (sensation, mobility, emotion, cognition, self care, pain and fertility), each of which has between three and five levels. The levels describe a range of functioning, from ‘normal functioning for age’ to ‘extreme disability’. By assuming fertility to be normal, the HUI2 can be viewed as a generic measure. We used this revised descriptive system (excluding fertility) in the valuation surveys. Appendix A gives the descriptive system in full.

5. Valuation surveys

Three health state valuation surveys were undertaken to carry out this study. Fig. 1 gives details of the surveys, including the number of respondents, number of health states valued, methods of valuation and analyses undertaken. We designed the study to replicate the methods of Torrance et al. in their original valuation of the HUI2 and also to compare this to another method of valuation (the SI approach) that had been used more recently in the valuation of the SF-6D. For this reason, we had one survey for an MAUF function, one for a SI function and a small external validation dataset.

Survey 1 was designed to parameterise a multiplicative multi-attribute utility function for the HUI2, following the methods described by Torrance et al. for the original HUI2 valuation study (Torrance et al., 1996). This survey had 201 respondents, approximately equal to the number of respondents in the original HUI2 valuation study (Torrance et al., 1996). Survey 2 was designed to parameterise a linear additive health state valuation model, using the SI approach. The design of this survey was based on the valuation work of the SF-6D (Brazier et al., 2002) and had 198 respondents.

SI does not have to be linear additive but the data requirements to estimate more complex models would be prohibitive. Also, as there is not really any theory (other than MAUT) to defend more complex models it seems reasonable to take this more parsimonious approach and use a linear additive model as a kind of proxy for a more complex form.

Fig. 1. Health state valuation surveys. Analyses: estimating visual analogue scale to standard gamble power curve mapping function, Survey 1; parameterise multi-attribute utility functions, Survey 1; estimating statistical inference valuation model, Survey 2. Comparing predictive performance of the MAUF and Statistical Inference valuation models, Survey 3.

Survey 3 was designed to provide a small validation dataset of direct health state valuations to assess the predictive performance of the MAUF and SI models. It used the same valuation script as Survey 2 and had 51 respondents. The size of this sample was small due to resource constraints.

6. MAUF methods

The original MAUF methods by Torrance et al. (Torrance et al., 1996) involved valuing single attribute states (valuing the levels within a dimension) using the visual analogue scale (VAS). Subsequently, 12 multi-attribute health states were valued using the VAS. These 12 states included all the corner states (where a health state contains one dimension at its worst functioning and all other dimensions at their best functioning), 4 other health states and the worst health state. Four of these 12 health states were also valued using the standard gamble (SG) technique relative to perfect health and dead. As VAS measures preferences under certainty, it is said to give value scores and as SG measures preferences under uncertainty, it is said to give utility scores. A mapping function was estimated between the VAS values and SG utilities for these four states and used to transform the remaining VAS values into utilities. These utility values are used to parameterise the MAUF. We replicated this approach here, making adjustments to the health states, as we did not include the fertility dimension (McCabe et al., 2005b).
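As a rough illustration of the mapping step, the sketch below fits a one-parameter power curve to paired VAS and SG disutilities. The functional form (SG disutility = VAS disutility raised to alpha), the log-scale least-squares fit, the function name `fit_power_curve` and the four data points are all assumptions made for illustration; the specification actually used is reported in McCabe et al. (2005b).

```python
import math

def fit_power_curve(vas_disutility, sg_disutility):
    """Least-squares estimate of alpha in sg = vas ** alpha, fitted on the log scale."""
    xs = [math.log(v) for v in vas_disutility]
    ys = [math.log(s) for s in sg_disutility]
    # Regression through the origin: alpha = sum(x * y) / sum(x * x)
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

# Hypothetical mean disutilities for four states (NOT data from the study).
vas = [0.30, 0.50, 0.70, 0.90]
sg = [v ** 2 for v in vas]  # constructed so that the true exponent is 2

alpha = fit_power_curve(vas, sg)
# Apply the fitted curve to move VAS disutilities onto the SG scale.
mapped = [v ** alpha for v in vas]
```

With only four anchor states the exponent rests on very little data, which is one reason the choice of mapping function can matter so much for the resulting utilities.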

The multiplicative utility function for the HUI2 has the following form (Torrance et al., 1992):

$$U(x) = \frac{1}{k}\left[\prod_{j=1}^{6}\left(1 + k\,k_j\,u_j\right) - 1\right] \tag{1}$$

where

$$1 + k = \prod_{j=1}^{6}\left(1 + k\,k_j\right) \tag{2}$$

Here $u_j$ is the single attribute utility function for attribute j, U(x) is the utility of health state x represented by an x-element vector, and k and $k_j$ are model parameters. This is the utility formulation of the MAUF. Following the methods of Torrance et al. (1996), we constructed the MAUF to predict the disutility of a health state (1 minus utility) in order to make the measurement task easier. Therefore we estimated the disutility formulation of the MAUF, where the model parameters are denoted as d and $d_j$, respectively (McCabe et al., 2005b).
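For reference, the disutility formulation can be written by direct analogy with Eqs. (1) and (2). The bar notation for disutilities is an assumption introduced here for illustration; the parameterisation actually used is documented in McCabe et al. (2005b).

```latex
\bar{U}(x) = \frac{1}{d}\left[\prod_{j=1}^{6}\left(1 + d\,d_j\,\bar{u}_j\right) - 1\right],
\qquad
1 + d = \prod_{j=1}^{6}\left(1 + d\,d_j\right),
\qquad
U(x) = 1 - \bar{U}(x)
```

Under this form the best state (all $\bar{u}_j = 0$) has disutility 0 and the worst state (all $\bar{u}_j = 1$) has disutility 1, because the product then equals $1 + d$.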

7. Statistical inference methods

The statistical approach in Survey 2 involved collecting SG data on 51 health states and then estimating a range of models to predict utilities for all other health states in the descriptive system, using a range of functional forms. The states were made up of 49 that formed an orthogonal array for estimating an additive model, along with the worst health state described by the HUI2. An orthogonal array selects an array of health states that need to be valued in order to estimate an additive model. Perfect health was selected in the orthogonal array but as its value is fixed in valuation tasks, we substituted 2 further health states, giving a total of 51 states. The SG method was relative to perfect health and dead. The preferred model was chosen on the basis of predictive performance, i.e. the ability to predict observed utilities.
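To give a sense of scale for "all other health states in the descriptive system", the following minimal sketch enumerates every state of the six-attribute system. The level counts are read from Appendix A on our reading of it; the names `LEVELS`, `all_states` and `share_valued` are illustrative only.

```python
from itertools import product

# Number of levels per attribute, read from Appendix A (fertility excluded).
LEVELS = {"sensation": 4, "mobility": 5, "emotion": 5,
          "cognition": 4, "self_care": 4, "pain": 5}

# Every state is a tuple of levels, one per attribute, in the order above.
all_states = list(product(*(range(1, n + 1) for n in LEVELS.values())))
n_states = len(all_states)        # 4 * 5 * 5 * 4 * 4 * 5

# Fraction of the descriptive system valued directly in Survey 2.
share_valued = 51 / n_states
```

The 51 directly valued states are well under one percent of the full system, so almost every health state value comes from the fitted model and not from direct measurement.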


Full methods for estimating the MAUF and Statistical models are reported in detail elsewhere (McCabe et al., 2005a,b).

8. Assessing predictive performance

Torrance et al. assessed the predictive performance of the original MAUF for the HUI2 by examining how accurately the model predicted the mean health state utilities for the four health states valued using SG relative to perfect health and dead (Torrance et al., 1996). Whilst these states were not strictly used to parameterise the MAUF, they were used in estimating the power curve function, and therefore had some influence on the MAUF parameters.

In contrast to the original HUI2 valuation study (Torrance et al., 1996), we assess the predictive performance of the new MAUF and the SI model using a completely independent (validation) dataset collected via Survey 3. We collect SG data on 14 health states selected by generating a new orthogonal array and selecting the first 14 states different to those used in Surveys 1 and 2, plus the worst health state, giving a total of 15 health states.

We report a number of complementary measures of predictive performance: the root mean square error (RMSE), the mean absolute error (MAE) and the number of states predicted to within 0.03 of the observed mean health state utility. This threshold was chosen on the basis of Drummond’s identification of this value as a clinically meaningful difference in utility score on the HUI2 (Drummond, 2001). In addition, we test for bias in the predictions, by undertaking a t-test that the mean value of the prediction errors is significantly different to zero.
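The four measures can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `predictive_performance`, the sign convention (predicted minus observed) and the example values are assumptions, and the data are hypothetical.

```python
import math
import statistics

def predictive_performance(observed, predicted, threshold=0.03):
    """RMSE, MAE, count of states within the threshold, and a one-sample t statistic."""
    errors = [p - o for o, p in zip(observed, predicted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    within = sum(1 for e in errors if abs(e) <= threshold)
    # t statistic for H0: mean prediction error = 0 (n - 1 degrees of freedom)
    t_stat = statistics.mean(errors) / (statistics.stdev(errors) / math.sqrt(n))
    return {"rmse": rmse, "mae": mae, "within": within, "t": t_stat}

# Hypothetical mean utilities for four states (NOT data from the study).
observed = [0.20, 0.40, 0.60, 0.80]
predicted = [0.25, 0.38, 0.70, 0.79]
result = predictive_performance(observed, predicted)
```

RMSE penalises large misses more heavily than MAE, while the t statistic picks up systematic over- or under-prediction that the two error magnitudes cannot distinguish from noise.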

9. Results

Table 1 reports the socio-economic characteristics of the respondents in each of the three valuation surveys. The characteristics are broadly similar across the three populations, with Survey 1 having slightly more people with a mortgage and Survey 3 having more self employed people.

In Survey 3, 51 interviews were completed. Adopting the same exclusion criteria as used in the MAUF and SI model estimation studies (McCabe et al., 2005a,b) led to no exclusions. The average number of observations per health state was 25.3.

Table 1
Socio-economic characteristics of respondents to Surveys 1, 2 and 3

Characteristic                    Survey 1   Survey 2   Survey 3
Mean age (years)                        52         54         54
% Male                                  36         40         43
% Married or living as married          69         69         71
% Renting property                      14         12         10
% With mortgage                         42         31         33
% Own property outright                 42         53         57
% Self-employed                         13         12         20
% Full time work                        66         60         65
% Part time work                        34         40         35
% Unemployed                             4          3          2
% Permanently sick/disabled              3          5          0
% Retired                               29         36         31
% In full time education                 2          3          2
% With a higher/first degree            25         19         22


Table 2
Multi-attribute disutility function on perfect-health-PITS* scale

Level   Sensation   Mobility   Emotion   Cognition   Self-care   Pain
1       0.000       0.000      0.000     0.000       0.000       0.000
2       0.291       0.257      0.229     0.257       0.252       0.145
3       0.508       0.746      0.488     0.516       0.588       0.404
4       1.000       0.942      0.685     1.000       1.000       0.664
5       n/a         1.000      1.000     n/a         n/a         1.000

d        d_sens   d_mob   d_emot   d_cog   d_sc    d_pain
−0.935   0.382    0.274   0.470    0.315   0.130   0.650

Death   0.932

* Where PITS is the worst health state in the descriptive system.
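The sketch below shows how the Table 2 values could be combined, assuming the disutility form is the direct analogue of Eqs. (1) and (2) with d and d_j in place of k and k_j. The bisection solver and the names `solve_d` and `disutility` are illustrative assumptions and not the authors' implementation; only the numbers are taken from Table 2.

```python
from math import prod

# Single-attribute disutilities by level (Table 2); list index = level - 1.
SINGLE = {
    "sensation": [0.000, 0.291, 0.508, 1.000],
    "mobility":  [0.000, 0.257, 0.746, 0.942, 1.000],
    "emotion":   [0.000, 0.229, 0.488, 0.685, 1.000],
    "cognition": [0.000, 0.257, 0.516, 1.000],
    "self_care": [0.000, 0.252, 0.588, 1.000],
    "pain":      [0.000, 0.145, 0.404, 0.664, 1.000],
}
# Attribute weights d_j (Table 2).
WEIGHTS = {"sensation": 0.382, "mobility": 0.274, "emotion": 0.470,
           "cognition": 0.315, "self_care": 0.130, "pain": 0.650}

def solve_d(weights, tol=1e-12):
    """Bisection for the non-zero root of 1 + d = prod(1 + d * d_j) on (-1, 0)."""
    ws = list(weights)

    def f(d):
        return prod(1 + d * w for w in ws) - (1 + d)

    lo, hi = -1.0, -1e-6  # f(lo) > 0 and f(hi) < 0 when the weights sum to more than 1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def disutility(state, d):
    """Multiplicative disutility of a state given as {attribute: level}."""
    terms = (1 + d * WEIGHTS[a] * SINGLE[a][level - 1] for a, level in state.items())
    return (prod(terms) - 1) / d

d = solve_d(WEIGHTS.values())            # close to the tabulated -0.935
perfect = {a: 1 for a in SINGLE}
pits = {a: len(levels) for a, levels in SINGLE.items()}
```

Solving the normalisation condition from the six weights recovers a value close to the tabulated d, which is a useful consistency check on the reconstructed table. To re-anchor a disutility to a scale on which dead = 0, one conventional step would be 1 − disutility/0.932, using the disutility of death from Table 2; that rescaling is an assumption here and is not stated in this section.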

The MAUF model and SI model estimation results have been reported in detail elsewhere (McCabe et al., 2005a,b); however, we provide a summary of the results in Tables 2 and 3. Table 2 reports the results of the MAUF model, which shows all the data required to parameterise the MAUF. Table 3 reports the preferred SI model (the OLS model). All the coefficients in the SI model are significant and they are all consistent (a worse level of functioning on a dimension is associated with a larger decrement), except for mobility level 5. 201 interviews were completed in Survey 1 and 198 interviews were completed in Survey 2.

Table 3
Statistical inference valuation model for HUI2 (OLS model)

Health state attribute level   Coefficient
Constant                        1
Sensation 2                    −0.114
Sensation 3                    −0.123
Sensation 4                    −0.225
Mobility 2                     −0.051
Mobility 3                     −0.122
Mobility 4                     −0.131
Mobility 5                     −0.113
Emotion 2                      −0.094
Emotion 3                      −0.112
Emotion 4                      −0.181
Emotion 5                      −0.184
Cognition 2                    −0.055
Cognition 3                    −0.096
Cognition 4                    −0.168
Self care 2                    −0.052
Self care 3                    −0.114
Self care 4                    −0.117
Pain 2                         −0.110
Pain 3                         −0.116
Pain 4                         −0.161
Pain 5                         −0.255

N                               1370
(Adj) R2*                       0.77

* This is regression through the origin and therefore the R-squared result is not directly comparable to models in which the constant is estimated. All coefficients are significant at the 0.1 level. Health state values are obtained by adding the (negative) coefficient for each attribute level in the health state to the constant (1.0), where perfect health = 1 and dead = 0.

Fig. 2. Observed and predicted values. Note: Health states are ordered in ascending order of mean utility scores from Survey 3 (validation survey).
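The additive scoring rule described in the footnote to Table 3 can be sketched as below. The coefficients are copied from Table 3; the dictionary layout and the function name `si_value` are illustrative assumptions.

```python
# Decrements from Table 3 (OLS model); level 1 of every attribute has no decrement.
COEF = {
    "sensation": {2: -0.114, 3: -0.123, 4: -0.225},
    "mobility":  {2: -0.051, 3: -0.122, 4: -0.131, 5: -0.113},
    "emotion":   {2: -0.094, 3: -0.112, 4: -0.181, 5: -0.184},
    "cognition": {2: -0.055, 3: -0.096, 4: -0.168},
    "self_care": {2: -0.052, 3: -0.114, 4: -0.117},
    "pain":      {2: -0.110, 3: -0.116, 4: -0.161, 5: -0.255},
}

def si_value(state, constant=1.0):
    """Health state value: the constant plus the (negative) coefficient of each level."""
    value = constant
    for attribute, level in state.items():
        if level != 1:                      # level 1 is the reference level
            value += COEF[attribute][level]  # raises KeyError for an invalid level
    return value

perfect = {a: 1 for a in COEF}
worst = {a: max(levels) for a, levels in COEF.items()}
mild = dict(perfect, sensation=2)
```

Summing the six worst-level decrements gives 1.062, so on these coefficients the worst state is valued slightly below zero, i.e. marginally worse than dead on the 0 to 1 scale.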

Fig. 2 plots the observed mean health state utilities for each of the 15 health states in the validation sample and the predicted health state utility for the MAUF and the SI valuation models.

The SI model predicts much more accurately than the MAUF (RMSE = 0.05 versus 0.16; MAE = 0.04 versus 0.13) and many more of the health states are predicted to within the ‘clinically significant’ range (7 versus 1). The MAUF model produced biased predictions (t = 6.84; p < 0.001), but the SI model did not (t = 1.09; p > 0.29).

10. Discussion

In this paper we report the first direct comparison of a MAUF and SI utility model in the health economics literature (Brazier et al., 1999). Taken at face value, our results suggest that the simple linear additive model may be superior to the multiplicative MAUF for representing preferences over health states in the HUI2 classification as it predicts health state utilities more accurately.

The superiority of the SI linear additive model suggests that there are no interactions between attributes. Such a conclusion strikes us as counter-intuitive, and is not consistent with the findings of the UK EQ-5D or SF-6D valuation studies (Dolan, 1997; Brazier et al., 2002). It may be that with a larger sample size, we would have found interactions. In addition, the number of health states required to do this in the orthogonal array was much larger and beyond our resource constraints. However, this may be something that could be explored in future research.

There are a number of limitations to this work as a test of MAUF versus SI models. The data used to parameterise the MAUF is heavily based on VAS data. This is now recognised as being subject to a number of biases, most notably range-frequency, or spreading bias (Torrance et al., 2001). This bias may have a substantial impact upon the MAUF in at least two ways. First, the choice of MAUF specification (additive, multiplicative or multi-linear) is determined by the sum of the values of the corner states. The corner states are valued by adding one state at a time on to the same VAS and therefore spreading bias makes it likely that the sum of their values will be greater than one (>1), which is the criterion for fitting a multiplicative MAUF, even if the true values for these corner states would sum to less than one. If the true values were to sum to less than one, then each state would have to have an average value below 0.2. Thus, the basis on which the multiplicative MAUF is fitted may be as a result of bias in the VAS data rather than true health state values.

In addition, as the single attribute health states are valued together on the VAS for each attribute, the values are likely to be affected by the number of levels in each attribute rather than capturing only the preference for each level within the attribute.

The VAS values are converted to SG utilities using a power curve mapping function, based upon the theory of a constant relative risk aversion advanced by Dyer and Sarin (Dyer and Sarin, 1982). Successive empirical analyses have found the power curve to be a poor model of the relationship between the VAS and SG in individual and aggregate data (Dolan and Sutton, 1997; Robinson et al., 2001; Bleichrodt and Johannesson, 1997). The power function used in our analyses has poor predictive performance even within the estimation sample and alternative functional forms lead to very different health state utilities (McCabe et al., 2005b).

In a separate paper we have investigated the use of alternative functional forms to the power curve and found the cubic function to be the best performing in terms of predictive validity (McCabe et al., 2005b). Even though this outperforms the power function, when compared to the SI model, the SI model still has the greater predictive ability. As different mapping functions lead to different predictions, it is therefore unclear whether the poorer performance of the MAUF is due to a problem with the mapping function, or is a problem with the MAUF itself. Therefore, research which compares a directly valued MAUF (with no mapping) with a SI model is desirable.

Torrance and colleagues used VAS to collect the health state value data as they needed each respondent to provide a full set of parameters for the MAUF in order to construct individual MAUFs. Using SG or the time trade off (TTO) method to obtain utilities for the 27 directly valued parameters in the MAUF was not feasible. However, when the purpose is to estimate a population MAUF, it is not necessary for all respondents to provide observations on all parameters. Feeny and colleagues recognised this in the valuation work for the health utilities index mark 3 (Feeny et al., 2002), although they persisted with the use of VAS, as they believe VAS to be a useful ‘warm up’ exercise, to introduce and familiarize the respondent with the health states. To fully assess the relative merits of MAUF and SI approaches it would be desirable to use either SG or TTO data only in both methods.

A further limitation of the work presented here is that the HUI2 classification system does not have completely independent attributes. Specifically there are interactions between the self-care and mobility attributes. These interactions mean that the self-care and mobility corner state utilities have to be estimated, rather than directly measured. This estimation assumed that a two attribute multi-attribute utility function for mobility and self-care is nested within the full MAUF and solving the two-attribute MAUF required additional assumptions which cannot be empirically tested (Torrance et al., 1992).

Alternative assumptions could lead to different health state utility predictions that could in turn affect the predictive performance of the MAUF.

There is an increasing interest in the ability of respondents to process the information presented to them in health state valuation exercises (Lloyd, 2003). The conventional view in the health state valuation literature is that individuals can retain around seven separate pieces of information in making a single decision (Miller, 1956). However, more recent research has suggested that it is very unlikely that people actually consider or process several independent pieces of information when making decisions; instead they may often employ heuristics to simplify the tasks they are given, which means they may ignore much of the information they are presented with (Lloyd, 2003).

The valuation process for the MAUF, particularly the simultaneous valuation of 12 multi-attribute health states on a single VAS scale, is likely to swamp the respondents’ information processing capacity. Thus, the data obtained may not reflect true preferences so much as the heuristics that individuals use to navigate complex information processing tasks. Thus, the apparently superior performance of the linear additive statistical model may mean that it is a better proxy for the heuristic processes than the MAUF, and not that preferences for health states are well represented by a linear additive function.

It is also possible that the difference in the performance of the models may be attributable to differences in the socio-economic (or other) characteristics of the two estimation samples, and between the estimation samples and the validation sample; i.e. one or more of the groups may have genuinely valued the health states higher or lower than one or more of the other groups. However, such differences in values are likely to be small.

Whilst acknowledging all these alternative explanations for the observed difference in the predictive performance, it is notable that our finding in favour of the SI model is consistent with that reported by Currim and Sarin, although this study was in a different context (Currim and Sarin, 1984).

11. Conclusion

We have presented the first direct comparison of an MAUF and SI health state valuation model. The results indicate that the SI model has a better predictive performance than the MAUF. In principle, these results are a challenge to the MAUT, which provides the theoretical foundation for the MAUF. However, the difference in performance cannot be ascribed to the choice of functional form alone. The reliance on VAS data and the empirically dubious power function relationship between VAS and SG are substantial reasons to doubt that the parameter values actually reflect respondents’ preferences over health states in the HUI2. Furthermore, the use of valuation techniques that are insensitive to individuals’ information processing capacity brings the meaning of the results into question.

In the context of increasing use of preference based health status measures in resource allocation decision processes, the lack of research on the appropriate functional form for health state valuation models represents a significant gap in the knowledge base. The research presented in this paper indicates that the MAUF approach has a case to answer. The MAUF model predictions appear to be inaccurate to a clinically meaningful degree as compared to the SI model. Comparative studies of MAUF and alternative SI models, which address the limitations of our work, would be valuable.

Acknowledgements

Research funded by the UK Medical Research Council. Development and assessment of risk adjustment methods for outcomes in paediatric intensive care in the United Kingdom. Grant Number: G9900013.


Appendix A

Health Utilities Index Mark 2 (revised without the fertility dimension), Torrance et al. (1996)

Dimension, level and description

Sensation
  1  Able to see, hear and speak normally for age
  2  Requires equipment to see or hear or speak
  3  Sees, hears, or speaks with limitations even with equipment
  4  Blind, deaf, or mute

Mobility
  1  Able to walk, bend, lift, jump and run normally for age
  2  Walks, bends, lifts, jumps or runs with difficulty but does not require help
  3  Requires mechanical equipment (such as canes, crutches, braces or a wheelchair) to walk or get around independently
  4  Requires the help of another person to walk or get around and requires mechanical equipment
  5  Unable to control or use arms or legs

Emotion
  1  Generally happy and free from worry
  2  Occasionally fretful, angry, irritable, anxious, depressed or suffering from “night terrors”
  3  Often fretful, angry, irritable, anxious, depressed or suffering from “night terrors”
  4  Almost always fretful, angry, irritable, anxious, depressed
  5  Extremely fretful, angry, irritable, anxious or depressed, usually requiring hospitalisation or psychiatric institutional care

Self-care
  1  Eats, bathes, dresses and uses the toilet normally for age
  2  Eats, bathes, dresses or uses the toilet independently with difficulty
  3  Requires mechanical equipment to eat, bathe, dress, or use the toilet independently
  4  Requires the help of another person to eat, bathe, dress or use the toilet

Cognition
  1  Learns and remembers schoolwork normally for age
  2  Learns and remembers schoolwork more slowly than classmates as judged by parents and/or teachers
  3  Learns and remembers very slowly and usually requires special educational assistance
  4  Unable to learn and remember

Pain
  1  Free of pain and discomfort
  2  Occasional pain. Discomfort relieved by non-prescription drugs or self-control activity without disruption of normal activities
  3  Frequent pain. Discomfort relieved by oral medicines with occasional disruption of normal activities
  4  Frequent pain. Frequent disruption of normal activities. Discomfort requires prescription narcotics for relief
  5  Severe pain. Pain not relieved by drugs and constantly disrupts normal activities
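
As an illustration only, and not part of the original appendix, the classification above could be encoded as a mapping from each dimension to its number of levels. This makes it straightforward to check whether a health state vector is admissible and to count the states the system can describe. The names `HUI2_LEVELS`, `is_valid_state` and `count_states` are hypothetical; only the dimensions and level counts are taken from the table.

```python
from math import prod

# Number of levels per dimension, read off the Appendix A table
# (HUI2 revised without the fertility dimension).
HUI2_LEVELS = {
    "sensation": 4,
    "mobility": 5,
    "emotion": 5,
    "self_care": 4,
    "cognition": 4,
    "pain": 5,
}


def is_valid_state(state):
    """Return True if `state` gives an in-range integer level for every dimension."""
    if set(state) != set(HUI2_LEVELS):
        return False
    return all(
        isinstance(level, int) and 1 <= level <= HUI2_LEVELS[dim]
        for dim, level in state.items()
    )


def count_states():
    """Number of distinct health states the classification can describe."""
    return prod(HUI2_LEVELS.values())


# Level 1 is the best level on every dimension; the highest level is the worst.
full_health = {dim: 1 for dim in HUI2_LEVELS}
worst_state = dict(HUI2_LEVELS)

print(is_valid_state(full_health))   # True
print(is_valid_state(worst_state))   # True
print(count_states())                # 8000
```

With these level counts the system describes 4 × 5 × 5 × 4 × 4 × 5 = 8000 distinct health states.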

References

Bleichrodt, H., Johannesson, M., 1997. An experimental test of a theoretical foundation for rating scale valuations. Medical Decision Making 17, 208–216.

Brazier, J.E., Deverill, M., Green, C., Harper, R., Booth, A., 1999. A review of the use of health status measures in economic evaluation. In: Health Technology Assessment 3. NCCHTA, Southampton.

Brazier, J., Roberts, J., Deverill, M., 2002. The estimation of a preference based measure of health from the SF-36. Journal of Health Economics 21, 271–292.

Currim, I.S., Sarin, R.K., 1984. A comparative evaluation of multi-attribute consumer preference models. Management Science 30, 543–561.

Dolan, P., 1997. Modelling valuations for EuroQol Health States. Medical Care 35, 1095–1108.

Dolan, P., 2002. Modelling the relationship between the description and valuation of health states. In: Murray, C.J.L., Salomon, J.A., Mathers, C.D., Lopez, A.D. (Eds.), Summary Measures of Population Health: Concepts, Ethics, Measurement and Applications. World Health Organisation, Geneva, pp. 501–514.

Dolan, P., Sutton, M., 1997. Mapping Visual analogue scale health state valuations on to standard gamble and time tradeoff values. Social Science and Medicine 44, 1519–1530.

Drummond, M.F., 2001. Introducing economic and quality of life measurements into clinical studies. Annals of Medicine 33, 344–349.

Dyer, J.S., Sarin, R.K., 1982. Relative risk aversion. Management Science 28 (8), 875–886.

Feeny, D., Furlong, W., Torrance, G.W., Goldsmith, C.H., Zhu, Z., DePauw, S., Denton, M., Boyle, M., 2002. Multiattribute and single attribute utility functions for the Health Utilities Index Mark 3 system. Medical Care 40, 113–128.

Feeny, D., 2005. Comparing and contrasting utilities and willingness to pay. In: Lenderking, W.R., Revicki, D.A. (Eds.), Advancing Health Outcomes Research Methods and Clinical Applications. Degnon Associates, McLean, VA, pp. 353–367.

Keeney, R.L., Raiffa, H., 1976. Decisions With Multiple Objectives: Preferences and Value Tradeoffs. Wiley, New York.

Lloyd, A.J., 2003. Threats to the estimation of benefit: are preference elicitation methods accurate? Health Economics 12, 393–402.

McCabe, C., Stevens, K., Roberts, J., Brazier, J.E., 2005a. Health State Values for the Health Utilities Index Mark 2 descriptive system: results from a UK valuation survey. Health Economics 14 (3), 231–244.

McCabe, C., Stevens, K., Brazier, J.E., 2005b. Utility scores for the Health Utilities Index Mark 2: an empirical assessment of alternative mapping functions. Medical Care 43, 627–635.

Miller, G.A., 1956. The magical number seven plus or minus two: some limits on our capacity to process information. Psychological Review 63, 81–97. Cited in: Fischer, G.W., 1978. Utility Models for multiple objective decisions: do they represent human preferences? Decision Sciences 10, 451–479.

Richardson, J., Day, N., Peacock, S., Iezzi, A., 2004. Measure of the quality of life for economics evaluation and the Assessment of Quality of Life (AQoL) Mark 2 instrument. The Australian Economic Review 37 (1), 62–88.

Robinson, A., Loomes, G., Jones-Lee, M., 2001. Visual analogue scales, standard gambles and relative risk aversion. Medical Decision Making 21, 17–27.

Schoemaker, P.H.J., 1982. The expected utility model: its variants, purposes, evidence and limitations. Journal of Economic Literature 20, 529–563.

Torrance, G.W., Feeny, D., Furlong, W., 2001. Visual Analogue Scales: do they have a role in the measurement of preferences for health states? Medical Decision Making 21, 329–334.

Torrance, G.W., Feeny, D.H., Furlong, W.J., Barr, R.D., Zhang, Y., Wang, Q., 1996. A multi-attribute utility function for a comprehensive health status classification system: Health Utilities Index Mark 2. Medical Care 34, 702–722.

Torrance, G.W., Zhang, Y., Feeny, D., Furlong, W., Barr, R., September 1992. Multi-attribute preference functions for a comprehensive health status classification system. McMaster University Centre for Health Economics and Policy Analysis Working Paper #92–18.