Willingness to Pay for a Quality‐Adjusted Life‐Year: The Individual Perspective

Introduction

Decisions regarding reimbursement and allocation of funds within the health-care budget increasingly are influenced by the results of cost-effectiveness analysis (CEA). CEA evaluates two or more alternative interventions in terms of their benefits (expressed in a nonmonetary measure) and costs, and summarizes the result in an incremental cost-effectiveness ratio (ICER). The ICER represents the additional costs per additional health unit produced by one intervention in comparison to another. A common measure of health in this context is the quality-adjusted life-year (QALY), which comprises both length and quality of life. When using the QALY as outcome measure, the ICER represents the ratio of incremental costs per QALY gained. Typically, an intervention is considered cost-effective if the ICER falls below a certain cost-effectiveness “threshold,” indicating some monetary value of a QALY. Some 10 years ago, Johannesson and Meltzer 1 argued that without explicating such a threshold value, CEA cannot be considered a proper decision-making tool, as it would lack a systematic and universally recognizable decision criterion.

Recent literature has seen a lively debate on implicit and explicit cost-effectiveness threshold(s), although without reaching consensus on the nature or height of an appropriate monetary value of a QALY 2-7. In the meantime, various institutions and governmental bodies (such as the National Institute for Health and Clinical Excellence [NICE] in the UK, the Swedish Pricing and Reimbursement Board, the Pharmaceutical Benefits Advisory Committee in Australia, and CVZ in The Netherlands) have adopted threshold values in the process of optimizing the allocation of health-care resources, albeit sometimes implicitly and inconsistently. The acceptable ranges of the monetary value of a QALY used in such decision-making, however, appear to be broad and tend to lack empirical underpinning 8, 9. This underlines the importance of further investigating the monetary value of a QALY.

The apparent reluctance to research and estimate a “true” value of a QALY has its roots in various arguments and in empirical, theoretical, and methodological challenges inherent to the process of obtaining such a number. For example, there is evidence that the willingness to pay (WTP) for a QALY is nonconstant and dependent on the size, duration, and type of the health gain 10-15. It might thus be impossible to elicit a unique individual WTP for a QALY, as suggested for example by Bleichrodt and Quiggin 16. Matters are additionally complicated by the societal context of decision-making in health care. From the societal perspective, which aligns with the decision-maker's approach, the beneficiaries from health-care services need not be the payers of those services and therefore characteristics other than the size of the health gain may play a role in the valuation of QALYs. A review by Dolan et al. 17, for instance, showed the age of the beneficiary to be an important equity consideration that ought to be included in the social valuations of publicly provided health-care services. The discrepancy between individual and societal valuations, elicited from an ex ante or ex post perspective, could be considerable 18.

In spite of these problems, it is important to continue research in this area and work toward a higher level of transparency and consistency in societal decision-making. Seeking to find appropriate monetary values for QALY gains should not be seen as necessarily attempting to establish a firm link between CBA and CEA, but rather as an aid to decision-makers 12. Indeed, Weinstein 19 recently concluded that “it is time to lay to rest the mythical $50,000 per QALY standard and begin a real public discourse on processes for deciding what health care services are worth paying for.”

This study aimed to elicit the first empirical estimate of the monetary value of a QALY in The Netherlands. In doing so, it applies a carefully designed questionnaire, which draws on previous studies in this field. Specifically, it uses a contingent valuation approach, from the individual perspective and under certainty, to answer how much Dutch citizens are willing to pay for a QALY gain. This is one of three ways of determining what the optimal cost-effectiveness threshold should be 2. First, the threshold can be inferred from previous decisions taken by leading institutions such as NICE 3, 8. Second, it can be set to exhaust an exogenously determined budget 6. Third, it can be set by identifying the marginal value that society attaches to health. While WTP is a common way of deriving the value of a commodity, only a few studies have applied it in this area 15, 20-22. This study offers a more comprehensive approach to WTP elicitation in terms of the number of health states valued (in absolute terms and per respondent; e.g., 20, 21, 15), ensuring a good coverage of the QALY scale. The two-step WTP elicitation method used in this study, a payment scale (PS) followed by a bounded direct follow-up question, is also more comprehensive than what was usually applied, because it combines two (linked) elicitation questions in order to arrive at a more precise estimate of the maximum WTP. This was done to combine the ease of a PS with the precision of an open-ended (OE) format. Moreover, the study applies several different ways of mitigating the hypothetical nature of the exercise throughout. Finally, the robustness of the findings was supported by the sample properties and size, arguably leading to greater generalizability of the results.

Like previous studies, however, the current study takes the individual perspective on WTP elicitation. It is the first step in a larger research effort designed to estimate the societal value of a QALY in The Netherlands. As part of that larger study, our results offer a reference point for future findings and give important practical insight into how to derive the appropriate values to be used in social decision-making. We also aimed to compare our findings with the empirical estimates reported in the literature.

In the following sections, we present the methodology and the design of the study. We then present and discuss the main results of the study and the results of various subsample analyses. Finally, we compare our value of a QALY with existing estimates, discuss the underlying reasons for any differences, and consider the practical implications of our findings.

Methods

WTP for a QALY was elicited in a representative sample of the general public in The Netherlands, by means of contingent valuation. Previous research showed that the general population (i.e., a heterogeneous, less health-literate sample) provides more certain and less volatile health valuations and WTP estimates than patient and/or decision-maker groups 23. The respondents were recruited by a professional Internet sampling company, and the questionnaire was administered in October 2008 through the Internet. Participants did not receive direct monetary compensation, but a small sum was donated to a charity of their choice upon completion of the questionnaire.

Survey Instrument

In the introduction to the questionnaire, the respondents were briefed about the purpose and content of the questionnaire, and, to help them understand the WTP exercises, were offered three “warm-up” questions for nonhealth-related items (i.e., their WTP for: 1) a car; 2) housing; and 3) a pair of shoes). Next, the respondents were asked to describe their own health status using the EQ-5D profile and to rate own health, perfect health, and death on the EQ-VAS 21. The respondents had the possibility to adapt the ratings until final confirmation was given.

After this introduction, the respondents solved five choice scenarios. Each scenario contained two EQ-5D health profiles or health states (please note, scenario design is discussed below). The respondents were asked which of the two health states they considered as the better one (see screen 1 in Appendix 1 found at: http://www.ispor.org/Publications/value/ViHsupplementary/ViH13i8_Bobinac.asp) and then requested to place the two health states on a visual analog scale (VAS) showing their previous valuations of current health, death, and perfect health (see screen 2 in Appendix 1). Next, the respondents were asked to imagine being in the health state they had chosen as the better one and to indicate their WTP to avoid spending 1 year in the health state they had chosen as the worse. This health loss (i.e., the difference between the better and the worse health state in the scenario) could be avoided by taking a painless medicine of unspecified properties once a month, for which one had to pay out-of-pocket in 12 monthly instalments (see Appendix 1 at: http://www.ispor.org/Publications/value/ViHsupplementary/ViH13i8_Bobinac.asp for the full question). The vehicle of health improvement was only described as “painless medicine” in order to remove any possible contamination of the health gain evaluation according to the means by which that improvement would be brought about 24.

Next, the WTP was elicited in a two-step procedure: first, a PS 25-29 was offered, followed by a bounded “OE” question. The boundaries in the “OE” question were determined by the amounts the respondents had indicated to certainly pay or certainly not pay in the PS phase.

In particular, in the first step, the respondents were presented with an ordered low-to-high PS of monthly installments (in €: 0, 10, 15, 25, 50, 75, 100, 125, 150, 250, 300, 500, 750, 1000, 1500, 2500), and asked to indicate the maximum amount they would certainly pay (screen 3) and the first amount they would certainly not pay (screen 4 27). By asking the respondents to identify all the amounts they would certainly pay and those that they would certainly not pay, the method provided information about the range of values over which people are uncertain 30. In the second step, the respondents were presented with a bounded direct “OE” follow-up question and asked to indicate the maximum amount they would pay if asked to do so right now. This maximum WTP was deemed as the appropriate estimate to be used in the calculation of the WTP for a QALY, and is our central WTP estimate. This estimate was bounded by the higher and the lower value the respondents previously chose on the PS (screen 5). The combination of two WTP questions, although in the context of a bidding game, was applied before (e.g., 31, 32). The two-step contingent valuation approach was applied to arrive at a directly and precisely indicated estimate of the maximum WTP within a range of WTP which was informed by the results from the less precise, but informative and easy-to-use PS. This two-step approach also added information and potentially robustness to our findings because the respondents used two different valuation techniques within one questionnaire. The benefit of employing two different WTP formats, although in a context of two entirely separate WTP questions, was investigated by Johnson et al. 33.
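The two-step logic described above can be illustrated with a minimal sketch. It assumes that the bounded "OE" answer may equal either bound (the text does not say whether the bounds are inclusive) and it ignores the edge case of a respondent who would certainly pay the top amount of the scale. The names `TwoStepResponse`, `is_consistent`, and `uncertainty_range` are hypothetical and are not taken from the questionnaire software.

```python
from dataclasses import dataclass

# Monthly installments (in euros) offered on the payment scale, as listed in the text.
PAYMENT_SCALE = [0, 10, 15, 25, 50, 75, 100, 125, 150,
                 250, 300, 500, 750, 1000, 1500, 2500]


@dataclass
class TwoStepResponse:
    certainly_pay: int      # highest PS amount the respondent would certainly pay
    certainly_not_pay: int  # first PS amount the respondent would certainly not pay
    open_ended: float       # bounded open-ended answer, monthly, in euros


def is_consistent(r: TwoStepResponse) -> bool:
    """Check that both PS answers are on the scale, correctly ordered,
    and that the open-ended answer lies between them (inclusive bounds
    are an assumption of this sketch)."""
    on_scale = (r.certainly_pay in PAYMENT_SCALE
                and r.certainly_not_pay in PAYMENT_SCALE)
    ordered = r.certainly_pay < r.certainly_not_pay
    bounded = r.certainly_pay <= r.open_ended <= r.certainly_not_pay
    return on_scale and ordered and bounded


def uncertainty_range(r: TwoStepResponse) -> int:
    """Width of the interval over which the respondent is uncertain."""
    return r.certainly_not_pay - r.certainly_pay
```

For example, a respondent who would certainly pay €50, would certainly not pay €100, and states €60 in the follow-up question gives a consistent response with an uncertainty range of €50.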

Attention was also given to reducing the hypothetical bias inherent in contingent valuation exercises, through ex ante and ex post mitigation 34. Ex ante, the respondents were reminded to take their household income into consideration when solving the exercise 35. Moreover, the visual image of health states rated on the VAS remained present on the right-hand side of the screen, as a reminder of the size of the health gain being valued (see Appendix 1 at: http://www.ispor.org/Publications/value/ViHsupplementary/ViH13i8_Bobinac.asp). Ex post, the respondents were asked on which element of household spending they would economize in order to be able to pay for the painless medicine (answer options were: 1) food; 2) clothing; 3) entertainment; 4) sport; 5) savings; 6) charity; and 7) other) 36. To avoid respondent fatigue and repetition, this was asked only at the end of the first of the five scenarios. Finally, the respondents were asked to indicate the level of certainty in the answer provided. They were asked to imagine having to pay the stated amount in reality, and immediately, and the options included: 1) totally sure I would pay the stated amount; 2) pretty sure I would pay the stated amount; 3) neither sure nor unsure I would pay the stated amount; 4) not very sure I would pay the stated amount; or 5) unsure I would pay the stated amount. This follow-up question was introduced to identify a subset of responses whose valuations may more closely reflect their “true” WTP 36-38. Nevertheless, being surer in the valuation does not necessarily imply that the elicited value is “true” or that it necessarily reflects the revealed preference. It is only assumed that the stated WTP will probably deviate more from “true” WTP when the respondents are less sure about their answers.

When the respondents chose €0 as their maximum WTP, they were asked to indicate the reason behind this preference. The answer options were: 1) I am unable to pay more than €0; 2) avoiding the worse health state and remaining in the better health state is not worth more than €0 to me; 3) I am not willing to pay out of ethical considerations; and 4) something else (with an open text field for explanation). Options 1) and 2) were considered true WTP, and options 3) and 4) protest answers.
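The classification of zero responses can be written down compactly. The following is only an illustrative sketch: the enum `ZeroReason` and the function `classify_zero` are hypothetical names, and the mapping simply mirrors the rule stated above (options 1 and 2 are true zeros, options 3 and 4 are protest answers).

```python
from enum import Enum


class ZeroReason(Enum):
    UNABLE_TO_PAY = 1      # "I am unable to pay more than 0 euros"
    NOT_WORTH_MORE = 2     # the health gain is not worth more than 0 euros
    ETHICAL_OBJECTION = 3  # not willing to pay out of ethical considerations
    OTHER = 4              # something else (open text field)


# Options 1 and 2 are treated as true zero WTP, per the rule in the text.
TRUE_ZERO_REASONS = {ZeroReason.UNABLE_TO_PAY, ZeroReason.NOT_WORTH_MORE}


def classify_zero(reason: ZeroReason) -> str:
    """Label a zero-WTP response as a true zero or a protest answer."""
    return "true zero" if reason in TRUE_ZERO_REASONS else "protest"
```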

The scenarios were presented to the respondents in a random order to control for possible order bias, although such effects may be impossible to eradicate entirely 39. Still, by adopting a randomized order, the potential bias was distributed more or less evenly across the blocks.

Following the main part of the questionnaire, the respondents were asked about their socioeconomic and demographic characteristics.

The questionnaire was pilot tested in a random sample of 100 respondents in order to determine the plausibility and clarity of the tasks, the feasibility of the questionnaire as a whole, and to test the range of the PS. The respondents had several opportunities to express their opinion about the tasks at hand. The results of the pilot showed that the questionnaire was clear and feasible, with no evidence to support the claim that the task was found unrealistic. Moreover, the two-step contingent valuation exercise proved feasible. The results of the pilot did point out that the distribution and spread of the PS were not optimal; the initial scale encompassed three value categories above €2500 (i.e., €5000, €7500, and €10,000), which were never chosen. To avoid loss of information and possible anchoring to exaggerated high values, the maximum was set at €2500 for the main study and additional value categories were added to the scale around the most frequently chosen values.

Design of Scenarios

Forty-two health states were paired into 29 choice scenarios (see Appendix 2 at: http://www.ispor.org/Publications/value/ViHsupplementary/ViH13i8_Bobinac.asp) representing a fair spread of QALY gains across the utility plane (see Fig. 1). The majority of the pairs was originally applied for deriving the UK tariffs for the EQ-5D 40, and 16 out of the 29 pairs were also applied in deriving the Dutch tariffs 41. The few scenarios that were not applied in deriving the UK or the Dutch tariffs were chosen for the purpose of testing other hypotheses, on which the current study does not focus. The 29 scenarios were split into 10 blocks of five scenarios, and randomly assigned to a bit more than 100 respondents per block. Two scenarios per block were randomly assigned to one of the 10 blocks, and three were purposefully selected into blocks. These scenarios were assigned to blocks in order to ensure that in each block, the respondents encountered health gains situated on the low, middle, and high end of the utility scale (according to Dutch EQ-5D tariffs). Given the design, the changes in health between two health states (according to Dutch EQ-5D tariffs) ranged from 0.004 to 0.738 QALY, with a mean of 0.32 and a median of 0.34 QALY. Several scenarios were designed such that one health state was unambiguously better than the other.

Fig. 1. Spread of gains across the utility plane.

QALY gains were also calculated from sample-specific VAS scores 28 obtained from the valuations in the questionnaire (i.e., “raw” scores); mean (and median) scores of perfect health and death were used for rescaling, based on the formula:

[Formula shown as an image in the source; not reproduced in this text.]
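A conventional linear rescaling that matches the verbal description is sketched below. It is an assumption, not the authors' confirmed formula: the raw VAS score of a health state h is anchored on the sample mean (or median) VAS scores of death and perfect health, so that death maps to 0 and perfect health to 1. The symbols are introduced here for illustration only.

```latex
u_{\text{rescaled}}(h) =
  \frac{\mathrm{VAS}(h) - \overline{\mathrm{VAS}}_{\text{death}}}
       {\overline{\mathrm{VAS}}_{\text{perfect health}} - \overline{\mathrm{VAS}}_{\text{death}}},
\qquad
\Delta \mathrm{QALY} = u_{\text{rescaled}}(h_{\text{better}}) - u_{\text{rescaled}}(h_{\text{worse}})
```

Under this assumption, the QALY gain for a scenario is the difference between the rescaled scores of the better and the worse health state over the 1-year duration.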

Combining the highest (€2500) and the lowest values (€10) of the PS with the minimum (0.004) and the maximum (0.738) QALY gains defined by the design produces an implicit maximum WTP of €7,500,000 (2500/0.004 * 12) and an implicit minimum WTP of €163 (10/0.738 * 12), with an implicit average WTP for a QALY of €17,862 (476.31/0.32 * 12) (see Appendix 2 at: http://www.ispor.org/Publications/value/ViHsupplementary/ViH13i8_Bobinac.asp). The ratio is multiplied by 12 because we ask about the monthly installment and a yearly health gain.
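The arithmetic behind these implicit bounds can be checked with a short sketch. The function name `implied_wtp_per_qaly` is hypothetical; the inputs are the figures quoted in the paragraph above.

```python
def implied_wtp_per_qaly(monthly_amount: float, qaly_gain: float,
                         months: int = 12) -> float:
    """Annualize a monthly installment and divide by the 1-year QALY gain."""
    return monthly_amount * months / qaly_gain


# Highest PS amount combined with the smallest QALY gain in the design.
implicit_max = implied_wtp_per_qaly(2500, 0.004)

# Lowest nonzero PS amount combined with the largest QALY gain in the design.
implicit_min = implied_wtp_per_qaly(10, 0.738)

# Average amount and average gain as quoted in the text.
implicit_avg = implied_wtp_per_qaly(476.31, 0.32)

print(round(implicit_max), round(implicit_min), round(implicit_avg))
```

Rounded to whole euros, the three values reproduce the €7,500,000, €163, and €17,862 reported above.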

Analysis

WTP per QALY was calculated as the ratio of the WTP for avoiding the move from the better to the worse health state to the QALY difference between the two health states. This ratio was calculated for two utility elicitation techniques (i.e., using EQ-5D tariffs and EQ-VAS scores), two WTP elicitation techniques (i.e., PS and “bounded” OE formats), and for each of the five scenarios (i.e., taking the means of ratios of each individual scenario). The approach of taking the mean of ratios accounts for the individual variation in the marginal utility of income, and overall heterogeneity in preferences, because individuals' WTP for a QALY is directly imputed into the calculation of the mean. The most relevant WTP per QALY estimate was calculated based on valuations from the bounded direct “OE” follow-up question.
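The difference between a mean of ratios and a ratio of means can be seen in a small sketch. The response data below are invented for illustration and are not from the study, and the function names are hypothetical. The sketch also assumes every QALY gain is nonzero, since a zero gain would make the individual ratio undefined.

```python
from statistics import mean


def wtp_per_qaly(monthly_wtp: float, qaly_gain: float) -> float:
    """Individual ratio: yearly WTP (12 monthly installments) per QALY gained."""
    return 12 * monthly_wtp / qaly_gain


def mean_of_ratios(responses):
    """Average the individual WTP-per-QALY ratios (the approach in the text)."""
    return mean(wtp_per_qaly(w, g) for w, g in responses)


def ratio_of_means(responses):
    """Alternative for comparison: mean yearly WTP divided by mean QALY gain."""
    return 12 * mean(w for w, _ in responses) / mean(g for _, g in responses)


# Hypothetical (monthly WTP in euros, QALY gain) pairs.
responses = [(50, 0.10), (100, 0.40), (25, 0.25)]

mor = mean_of_ratios(responses)
rom = ratio_of_means(responses)
```

With these invented numbers the mean of ratios is about €3,400 per QALY and the ratio of means about €2,800, which shows how respondents with a high WTP for a small gain weigh more heavily in the mean of ratios.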

The heterogeneity in WTP per QALY ratios was primarily examined from the perspective of: 1) the level of household income, using the income categories presented in Table 1; and 2) the level of certainty in the WTP answers, by comparing the sample average WTP per QALY to the WTP per QALY of the respondents indicating the highest levels of certainty (pretty sure and totally sure).

Table 1. Summary statistics (n = 1091)

Variable                                       Mean    SD      Min     Max
Age                                            42.1    12.1    18      65
Sex (% male)                                   0.47    0.50
Marital status:
  Married (% yes)                              0.61    0.49
  Divorced (% yes)                             0.10    0.31
  Single (% yes)                               0.24    0.43
  Widow (% yes)                                0.03    0.16
  Unknown (% yes)                              0.02    0.14
Children (% yes)                               0.56    0.50
  Number of children (n = 3070)                2.23    10.1    1       10
Income groups:
  Group 1 (% <€1000)                           0.13    0.33
  Group 2 (% >€999 and <€2000)                 0.34    0.48
  Group 3 (% >€1999 and <€3500)                0.40    0.49
  Group 4 (% >€3499)                           0.12    0.33
Number of people living on household income    2.44    10.4    1       20
University education (% yes)                   0.36    0.48
Employment status
  Employed (% yes)                             0.62    0.48
  Unemployed (% yes)                           0.17    0.38
  Student (% yes)                              0.06    0.25
  Housewife/husband or retired (% yes)         0.14    0.35
Health status
  EQ-5D (Dutch tariff)                         0.84    0.22    −0.26   1
  EQ-VAS                                       78.5    170.1   0.00    100
  Suffering a chronic illness (% yes)          0.39    0.94
  Completion time of the questionnaire         18.8    60.13   9       61

Table 2. Willingness to pay (WTP) per quality-adjusted life-year (€, rounded to hundreds)

                                  All respondents: average     Certainty level: pretty sure or totally sure
                                  [n = 1,091, f = 5,253]       [n = 761, f = 2,984]
                                  Mean (SD)                    Mean (SD)
1. WTP: EQ-VAS, mean rescaled
   WTP: PS                        9,600 (35,800)               10,400 (32,900)
   WTP: OE                        12,900 (48,100)              13,100 (37,900)
2. WTP: EQ-VAS, median rescaled
   WTP: PS                        12,600 (47,100)              13,700 (43,200)
   WTP: OE                        17,000 (63,200)              17,300 (49,800)
3. WTP: Dutch EQ-5D tariffs
   WTP: PS                        17,900 (172,100)             21,200 (181,600)
   WTP: OE                        24,500 (213,600)             26,800 (204,300)

The theoretical validity of our results was examined with a log-linear clustered multivariate regression analysis with raw WTP estimates and WTP per QALY estimates as dependent variables. Both variables were expected to increase with the level of household income and to decrease with the number of people depending on this income, while raw WTP was also expected to increase with the size of the projected health gain. Within the multivariate regression context, we also explored if the health status of respondents and/or existence of chronic illnesses would in any way affect the WTP per QALY estimate.

Separate regressions were conducted for each of the utility and WTP elicitation techniques. Variables and their associations were compared using parametric and nonparametric tests. The results of the PS were tested for order bias by comparing the WTP estimates between samples that solved the same blocks of scenarios in different orders. The specification of the PS, and the mid-point and end-point bias, were investigated by examining response patterns on the PS, both in the pilot and the main study. The relationship between EQ-VAS and EQ-5D tariffs was checked for consistency. All analyses were conducted using STATA for Windows version 10.

Results

One thousand ninety-one respondents, representative of the Dutch population according to age (18–65 years), sex, and education, participated in the survey. The description of the sample is given in Table 1. The respondents were predominantly married, employed, and in very good health (EQ-5D 0.84; EQ-VAS 78.5). Thirty-nine percent of the sample reported suffering from a chronic condition; although the severity of the condition was not specified, the average scores on the EQ-VAS and EQ-5D tariff suggest that the respondents predominantly suffered from very mild or mild chronic conditions. The sample average net household income of €2564 a month, with an average of 2.44 household members depending on that income, adequately represents the Dutch national figures for 2008 42.

WTP for Nonhealth Items

The respondents gave plausible estimates of WTP for a car (mean €10,900, median €7000), a pair of shoes (mean €109, median €80), and housing (mean €201,600, median €200,000) of their choice. From this, we inferred that the respondents understood the exercise, although the focal point of the exercise—health—may be more difficult to value, as normal (direct) market prices are absent.

Utilities

The correlation between utility scores obtained from the EQ-VAS scores and the Dutch EQ-5D tariffs was low (r = 0.24). The average health gain was 0.32 (SD 0.2; median 0.34) based on EQ-5D tariffs and 0.33 (SD 0.29; median 0.25) based on the EQ-VAS. Although the average scores do not differ considerably, the difference between them is statistically significant (P = 0.02). The tests of consistency between EQ-VAS and the Dutch EQ-5D tariffs, conducted at the level of particular health states, showed that the two valuation techniques provided similar valuations mainly for health states situated in the middle range of the utility scale. It was tested and confirmed that the better health states received, on average, higher valuations on the EQ-VAS. The respondents reversed the ranking (i.e., valued the obviously worse health state higher on the EQ-VAS) in only 7% of scenarios on average.

Patterns in WTP Answers

Data inspection did not disclose any unusual patterns. Less than 1.5% of the respondents chose the highest level offered on the PS (i.e., €2500). Sixty-two respondents indicated, in one or more scenarios, that they would not pay more than €0 for a health gain (only 23 respondents indicated €0 in all five scenarios). No consistent relationship was found between the size of the health gain, household income, and zero WTP. The interpretations of zero WTP were uniformly distributed among the offered explanations, and protest answers were observed in only 1.4% of all scenarios. We therefore proceeded with the analysis without specifically considering (or excluding) these responses.

The distribution of the certainty in the provided answers revealed that the majority of respondents (56%) was either pretty or totally sure that they would actually pay the stated amount for the specified health gain; 33% indicated uncertainty, 8% was not very sure they would pay, and 3% indicated they were unsure they would pay. The majority of the respondents would give up charitable donations or savings if they needed to pay for the medicine out-of-pocket.

Test results did not indicate that the mid-point or range bias played a noteworthy role in our study. The results show a highly skewed distribution of values chosen from the PS, concentrated at the lower end of the scale. Although the results did not concentrate at one particular amount on the scale, the majority of values chosen on the scale fell between €50 and €200. Finally, the tests showed that WTP per QALY elicited when the scenario offering the largest gain was presented first in a block did not differ from WTP per QALY estimates elicited when the scenario offering the smallest gain was presented first, thus providing no evidence of order bias.

Maximum WTP per QALY

The estimates of WTP per QALY varied considerably with the method of calculation. Table 2 provides the breakdown of WTP per QALY values according to: 1) the source of health state valuations; 2) the two steps in the WTP elicitation (i.e., lower bound of the PS, that is, the amount people definitely would pay, and the OE follow-up); and 3) the level of certainty. In the bounded OE follow-up question, the respondents stated a mean maximum WTP per QALY of €24,500 (using Dutch EQ-5D tariffs). Estimates were systematically lower when QALY gains were calculated using the EQ-VAS scores (i.e., €12,900 and €17,000, rescaled on the mean or the median). The estimates were higher among the respondents indicating a high level of certainty in their answers, although the differences were not considerable. All estimates presented in Table 2 were statistically different from each other (P = 0.00). Finally, as an additional test, a zero WTP was assigned to responses that were “unsure” about the elicited WTP. This did not result in a significant change in the mean WTP (P = 0.07).

Subgroup Analyses

WTP per QALY varied considerably with household income in the expected direction, and reached €55,900 in the highest income group (Table 3). As noted before, the respondents indicating a higher level of certainty in their answers produced somewhat higher WTP per QALY estimates; those in the highest income group and with the higher certainty level elicited a mean individual WTP per QALY of €75,400 (using Dutch EQ-5D tariffs; see (3) in Table 3). VAS scores yielded considerably lower estimates (up to €35,300 for those in the highest income and certainty group; see (2) in Table 3). Differences in WTP per QALY were, however, only statistically significant between income group 4 and other groups (P = 0.00).

Table 3. Willingness to pay (WTP) per quality-adjusted life-year in different income groups and levels of certainty (€, rounded to hundreds; SD in parentheses)

            Income groups                               Income groups and certainty level: pretty sure or totally sure
Group       1          2          3          4          1          2          3          4
n           139        371        440        134        86         248        317        107
f           672        1,806      2,117      658        318        964        1,262      440

1. WTP: EQ-VAS, mean rescaled
WTP: PS     5,000      8,200      8,800      20,800     5,000      8,500      9,100      22,200
  (SD)      (12,900)   (41,100)   (22,100)   (60,900)   (7,600)    (30,100)   (19,600)   (63,600)
WTP: OE     8,000      11,400     11,900     25,200     6,700      11,100     11,500     26,900
  (SD)      (27,300)   (63,500)   (28,500)   (62,000)   (10,500)   (42,100)   (22,400)   (64,200)

2. WTP: EQ-VAS, median rescaled
WTP: PS     6,500      10,800     11,500     27,300     6,600      11,100     12,000     29,100
  (SD)      (17,000)   (54,000)   (29,100)   (80,000)   (9,900)    (39,600)   (25,700)   (83,600)
WTP: OE     10,500     15,000     15,700     33,200     8,800      14,700     15,100     35,300
  (SD)      (36,000)   (83,500)   (37,500)   (81,600)   (13,900)   (55,300)   (29,400)   (84,400)

3. WTP: Dutch EQ-5D tariffs
WTP: PS     8,200      14,300     15,100     47,100     8,000      11,100     17,400     63,900
  (SD)      (34,100)   (178,300)  (90,700)   (349,200)  (29,600)   (47,900)   (111,900)  (426,200)
WTP: OE     12,600     18,000     21,100     55,900     11,100     15,100     22,900     75,400
  (SD)      (287,600)  (182,400)  (128,400)  (369,000)  (41,300)   (60,000)   (156,300)  (450,000)

What Would You Not Pay?

On the PS, the respondents indicated that the minimum amount they were certainly not willing to pay for a QALY was €43,160 (see (3) in Table 4). Again, estimates were higher for respondents who were more certain in their answer (up to €48,600), for higher income groups (up to €86,100), and these characteristics combined (up to €114,900). Using VAS scores in the calculation of the amount the respondents were not willing to pay for a particular gain yields the estimate of up to €54,000 (see (2) in Table 4).

Table 4. Willingness to pay (WTP) per quality-adjusted life-year upper bound, average, and different income groups and levels of certainty (€, rounded to hundreds) All respondents: average [n = 1,091, f = 5,253]; (SD)
