Evaluation of the EQ-5D-5L, EQ-VAS stand-alone component and Oxford knee score in the Australian knee arthroplasty population utilising minimally important difference, concurrent validity, predictive validity and responsiveness

Total knee arthroplasty (TKA) is a safe and cost-effective surgery for patients with osteoarthritis who do not respond to medical therapy alone. [1] In Australia, a total of 54,102 replacements were performed per year from 2017 to 2018 (218 per 100,000). [2] Despite the well-established safety data and patient improvements published over the last 20 years [1], the measurement of patient-related outcomes, including functional change or improvement, is not as clear-cut for TKA as it is for other orthopaedic surgery such as total hip arthroplasty [3, 4].

Patient-reported outcome measures (PROMs) are used to evaluate patient and health economic outcomes; one example is the 5-level version of the EuroQol 5 Dimensions (EQ-5D-5L index score). This standardized health-related quality of life (HRQoL) questionnaire was initially developed in 1990 as a 3-level version designed to assess general health across five dimensions. [5, 6] In 2011, it was revised to a 5-level version (EQ-5D-5L index), with five levels for each of the five dimensions, to increase the granularity of health responses and reduce the ceiling effect. [7] The EQ-5D questionnaires are among the most widely used PROMs globally; in some countries, such as the United Kingdom, they are used to calculate the quality-adjusted life years used in cost-utility analysis [8,9,10].

While extensively used in other parts of the world, the EQ-5D-5L index score has not yet been well validated for HRQoL assessment in the Australian orthopaedic population. [11] The results of the EQ-5D-5L index score PROM are converted into vectors, which are five-digit codes each representing a health state. For example, 11111 is full health, and 55555 represents the worst health. There are 3,125 possible health states. These are mapped onto a single utility index using a country-specific value set. To date, more than 25 countries have validated country-specific EQ-5D-5L value sets for various patient populations [12].
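The profile encoding described above can be illustrated with a minimal Python sketch. It only enumerates and validates the five-digit health states; it does not reproduce any published value set, since converting a profile to a utility index requires the country-specific weights. The function name `is_valid_profile` and the constants are illustrative assumptions, not part of the EQ-5D-5L specification.

```python
from itertools import product

LEVELS = "12345"   # assumed coding: 1 = no problems ... 5 = extreme problems
N_DIMENSIONS = 5   # five dimensions per health state

def is_valid_profile(code: str) -> bool:
    """Return True if `code` is a five-digit health state such as '11111'."""
    return len(code) == N_DIMENSIONS and all(ch in LEVELS for ch in code)

# Every combination of five levels across five dimensions.
all_profiles = ["".join(p) for p in product(LEVELS, repeat=N_DIMENSIONS)]

print(len(all_profiles))                  # 3125, i.e. 5 ** 5
print(all_profiles[0], all_profiles[-1])  # 11111 55555
```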

The EQ-VAS is a stand-alone component of the EQ-5D-5L index, in which a patient self-reports their impression of their general health and functionality. Compared with the in-depth, question-and-answer format of the EQ-5D-5L index, the EQ-VAS is seen as a simpler and less ambiguous format. [13] The Oxford Knee Score (OKS) is a validated PROM specifically developed to assess function and pain in patients undergoing TKA. [14] It has been utilised to assess the concurrent validity of the EQ-5D-5L index in TKA patients in other countries [15].

The minimally important difference (MID) is defined as the smallest change in PROM score that is perceived as important by patients or clinicians. [16] The MID is 'anchored' by using a satisfaction survey to identify patients who experienced a change in their functional status considered perceptible and clinically important. Changes in functional status were measured at one year postoperatively using a five-point Likert scale scored as (1) "very satisfied", (2) "satisfied", (3) "neither satisfied nor dissatisfied", (4) "dissatisfied", or (5) "very dissatisfied". Patients who scored 2 or 4 were considered to have experienced a change equivalent to the MID. [17] The anchor-based approach is generally considered the optimal method for evaluating the MID, as it yields a direct expression of the patient's preferences and values. [16] The distribution-based method of MID estimation instead relies on the distribution of scores around the mean of the measurement of interest, for example the standard deviation [18].

Concurrent validity describes the extent to which the method being tested correlates with an established method that measures the same outcome. Here the EQ-5D-5L index will be tested against the established OKS. Predictive validity describes the association between baseline and follow-up outcomes, which is highly valued in this cohort as it has implications for the surgical suitability of individual patients. Responsiveness, a measure of the sensitivity of PROMs to a change in health status over time, is also tested.

Outcome measure

This study aims to compare the EQ-5D-5L utility index and EQ-VAS against the OKS in Australian patients undergoing total knee arthroplasty using the minimally important difference (MID), concurrent and predictive validity.

Patients and methods

This multi-centre prospective trial was conducted at two large tertiary teaching hospitals in Adelaide, Australia. A group of orthopaedic surgeons operates routinely at both sites, performing approximately 300 knee arthroplasty surgeries annually. However, the number of patients operated on in 2020 was reduced to approximately 150 due to COVID-19-related restrictions. The local governing Human Research Ethics Committee granted multi-centre approval (SALHN/329.17).

All consecutive adult patients undergoing elective total knee arthroplasty surgery were prospectively enrolled over a nearly three-year period from 8th January 2018 to 1st October 2020, with a six-month follow-up until 2nd April 2021. The indication for surgery was predominantly osteoarthritis, and all joint replacements were primary operations. Informed consent was obtained from all participants, and baseline demographics were recorded for all patients, including age, gender, body mass index (BMI) and the Charlson comorbidity index (CCI) [19, 20].

Data were recorded at three different time points (preoperatively, six weeks and six months postoperatively) by one dedicated research assistant, using scripted questionnaires via telephone or a written survey sent by postal mail. At all three time points, two validated PROMs were used: the Oxford Knee Score (OKS) [21] and the EQ-5D-5L index score [5] including the EQ-VAS stand-alone component. Data were keyed into a password-secured database and stored on the hospital computer network.

Patients were included for analysis if they had complete quality of life data. This was defined as completing the EQ-5D-5L index score and OKS for the three time points.

Oxford knee score

The OKS is a joint-specific PROM [22, 23] which has been extensively utilised over the last 20 years. It assesses six fields (pain, walking, physical activity, function, quality of life and psychological wellbeing), with each field containing 2 questions, making up a total of 12 questions. Each question is scored on a 5-point discrete visual analogue scale where higher scores indicate better function. The final score is the sum of the individual question scores, with a range of 0 to 48. The OKS has previously been utilised as a comparator for responsiveness with PROMs such as the EQ-5D-3L and SF-12 in a similar patient population, albeit in countries other than Australia [24, 25].
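The scoring rule described above can be sketched in a few lines of Python. The sketch assumes each of the 12 items is coded 0 to 4, which is inferred from the stated 5-point scale and the 0 to 48 range; the function name `oks_total` and the example responses are hypothetical.

```python
N_ITEMS = 12        # twelve questions
ITEM_MIN, ITEM_MAX = 0, 4  # assumed per-item coding (5 points, higher = better)

def oks_total(item_scores):
    """Sum twelve item scores into a total between 0 and 48."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    if any(not (ITEM_MIN <= s <= ITEM_MAX) for s in item_scores):
        raise ValueError("each item score must lie between 0 and 4")
    return sum(item_scores)

# Hypothetical responses from one patient.
example = [3, 2, 4, 1, 3, 2, 2, 4, 3, 1, 2, 3]
print(oks_total(example))  # 30
```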

EQ-5D-5L index and EQ-VAS

The EuroQol Group designed the EQ-5D-5L index to quantify general health in adults. Using a 5-point scale (none, slight, moderate, severe and extreme/unable to perform), it evaluates the fields of mobility, self-care, usual activities, anxiety/depression and pain/discomfort. Based on the general Australian population, preference weights can be attached to each of the EQ-5D-5L health states. These were determined through a discrete choice experiment approach [26]. Utility indices vary from −0.676 to 1, with higher utilities signifying a better HRQoL.

The EQ-VAS is a vertical visual analogue scale which constitutes a part of the EQ-5D-5L index score and can also be used as a stand-alone component. Patients rate their general health from 0 to 100, with higher scores denoting better health. The EQ-5D-5L index questionnaire is based on specific national value sets or the generic Western Preference Pattern. [27] It has been validated in approximately 28 countries as of 2022 [28,29,30,31].

Statistical analysis

All statistical analyses were performed utilising STATA version 17 (StataCorp, Texas, USA). Continuous variables (age, BMI, CCI) were expressed as means and standard deviations. The categorical variable (gender) was expressed as percentages (counts). A p-value of < 0.05 was considered statistically significant.

Concurrent validity, predictive validity and agreement

For analysis of concurrent validity, Spearman's correlation coefficient (rho, ρ) was utilised to compare the EQ-5D-5L index and EQ-VAS against the OKS. The strength of the relationship was classified as low/weak (ρ < 0.25), fair (ρ = 0.25 to < 0.50), good (ρ = 0.50–0.75), or excellent (ρ > 0.75). These thresholds for rank-order correlations were sourced from previous publications in the same area [32, 33].
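As an illustration of the classification above, the following Python sketch computes Spearman's ρ from average ranks and maps it onto the four strength bands. It is not the Stata routine used in the study. The function names and the example scores are hypothetical, and applying the bands to the absolute value of ρ is an assumption, since the text states them only for positive values.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

def classify_strength(rho):
    """Map rho onto the bands in the text (absolute value is an assumption)."""
    r = abs(rho)
    if r < 0.25:
        return "low/weak"
    if r < 0.50:
        return "fair"
    if r <= 0.75:
        return "good"
    return "excellent"

# Hypothetical paired scores: utility index vs. knee score.
index_scores = [0.2, 0.4, 0.5, 0.7, 0.9]
knee_scores = [15, 22, 20, 35, 40]
rho = spearman_rho(index_scores, knee_scores)
print(round(rho, 3), classify_strength(rho))  # 0.9 excellent
```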

Predictive validity was ascertained using a regression framework, whilst controlling for confounders. We utilised generalized linear models with the 6-week and 6-month postoperative PROMs as the dependent variables, and the preoperative values and baseline characteristics as independent variables. Depending on the distribution of the dependent variable, the most appropriate distribution family and canonical link function were chosen. Multiple families (including the Gaussian, inverse Gaussian, Poisson, and Gamma distributions) were trialled when there was difficulty ascertaining the appropriate family of distribution. The best-fitting model was then selected based on the lowest Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) scores. The average marginal effect with respect to the preoperative score was used to compare models if different distribution families were utilised.
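The model-selection step can be sketched as follows, using the standard definitions AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L, where k is the number of estimated parameters, n the number of observations and ln L the maximised log-likelihood. The log-likelihoods, k and n below are invented purely for illustration and are not results from this study; the fitting itself (performed in Stata by the authors) is not reproduced.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits of the same outcome under different families.
# k = 6 assumes an intercept, the preoperative score, age, gender, BMI and CCI.
N_OBS = 150
candidates = {
    "gaussian": {"log_likelihood": -120.5, "n_params": 6},
    "inverse_gaussian": {"log_likelihood": -118.2, "n_params": 6},
    "gamma": {"log_likelihood": -115.9, "n_params": 6},
}

scores = {
    name: (aic(fit["log_likelihood"], fit["n_params"]),
           bic(fit["log_likelihood"], fit["n_params"], N_OBS))
    for name, fit in candidates.items()
}

# Pick the family with the lowest AIC; with equal k and n the BIC ranks identically.
best_family = min(scores, key=lambda name: scores[name][0])
print(best_family)  # gamma
```

When candidate models differ in their number of parameters, AIC and BIC can disagree, and the sketch would need an explicit rule for resolving that.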

The agreement between the EQ-5D-5L index and the OKS was measured using Bland–Altman analysis at all three measurement points.
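For readers unfamiliar with Bland–Altman analysis, the core quantities are the mean difference (bias) between two paired measurements and the 95% limits of agreement, bias ± 1.96 × SD of the differences. The sketch below assumes both measurements are already expressed on a common scale; how the two instruments were placed on a common scale in this study is not stated, and the example values are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(measure_a, measure_b):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    diffs = [a - b for a, b in zip(measure_a, measure_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired values on a shared 0-1 scale.
a = [0.50, 0.62, 0.70, 0.81]
b = [0.45, 0.62, 0.66, 0.85]
bias, lower, upper = bland_altman(a, b)
print(f"bias={bias:.4f}, limits of agreement=({lower:.4f}, {upper:.4f})")
```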

Responsiveness

Responsiveness is a measure of the sensitivity of PROMs to reflect the change in health status over time. For this study, we compared measurements at baseline, 6 weeks and 6 months follow-up using paired t-tests. Further assessment of responsiveness was quantified using effect size (ES) and standardized response mean (SRM).

The effect size was calculated using the formula: effect size equals the mean difference from baseline divided by the standard deviation at baseline.

The standardized response mean was calculated using the formula: standardized response mean equals the mean difference from baseline divided by the standard deviation of the difference.

ES and SRM were classified according to Cohen’s rule of thumb, as large (≥ 0.8), moderate (0.5–0.79) or small (< 0.5). [34] Both ES and SRM are standardized measures of change over time in health, independent of sample size.
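The two formulas and the classification above can be sketched in Python as follows. The use of the sample standard deviation (n − 1 denominator), the classification of the absolute value, the function names and the example scores are all assumptions made for illustration.

```python
from statistics import mean, stdev

def effect_size(baseline, follow_up):
    """ES = mean change from baseline / SD of the baseline scores."""
    diffs = [f - b for b, f in zip(baseline, follow_up)]
    return mean(diffs) / stdev(baseline)

def standardized_response_mean(baseline, follow_up):
    """SRM = mean change from baseline / SD of the change scores."""
    diffs = [f - b for b, f in zip(baseline, follow_up)]
    return mean(diffs) / stdev(diffs)

def classify_cohen(value):
    """Large (>= 0.8), moderate (0.5-0.79) or small (< 0.5)."""
    v = abs(value)
    if v >= 0.8:
        return "large"
    if v >= 0.5:
        return "moderate"
    return "small"

# Hypothetical knee scores for five patients.
baseline = [20, 25, 18, 30, 22]
six_months = [35, 38, 30, 41, 36]

es = effect_size(baseline, six_months)
srm = standardized_response_mean(baseline, six_months)
print(classify_cohen(es), classify_cohen(srm))  # large large
```

Because the SRM divides by the variability of the change scores rather than of the baseline, the two indices can differ considerably for the same data, which is one reason to report both.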

Influence of baseline characteristics on PROMs

Regression analysis was performed using generalised linear models with the baseline characteristics (age, gender, BMI and CCI) as independent variables. The preoperative EQ-5D-5L index, EQ-VAS and OKS were used as the dependent variables, and depending on their distribution, an appropriate distribution family and canonical link function were chosen using the same approach taken in the predictive validity analysis. The coefficients, standard errors and p-values were recorded.

Determination of minimally important difference

The minimally important difference (MID) is defined as the smallest change in score that is perceived as important by patients or clinicians. [35] The MID for the cohorts was defined as the change in PROM score for patients who responded as satisfied (2) or dissatisfied (4) to the anchor question at one year. The MID was determined using two approaches: the distribution-based approach and the anchor-based approach.

The distribution-based approach defined the MID as half the baseline standard deviation of the PROM scores. [36] For the anchor-based approach, we quantified satisfaction based on the anchor question (satisfaction rating). We then calculated Spearman's correlation coefficient to assess the correlation between the measured score and the satisfaction rating. The MID calculation was not performed if the correlation coefficient was less than 0.25. When calculating the MID using the anchor-based approach, we considered a satisfaction score of 2 or 4 as indicating an MID-equivalent change. The MID was then taken as the mean change in score of the patients who scored 2 or 4.
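Both MID definitions can be sketched in Python as below. The function names, the `anchor_rho` argument (a correlation assumed to have been computed beforehand), the use of its absolute value, the sample standard deviation and all example data are assumptions made for illustration, not the study's code or results.

```python
from statistics import mean, stdev

MIN_ANCHOR_RHO = 0.25  # below this correlation the anchor-based MID is skipped

def mid_distribution(baseline_scores):
    """Distribution-based MID: half the baseline standard deviation."""
    return 0.5 * stdev(baseline_scores)

def mid_anchor(changes, satisfaction, anchor_rho, anchor_levels=(2, 4)):
    """Anchor-based MID: mean change among patients rating 2 or 4.

    Returns None when the score-to-anchor correlation is too weak.
    """
    if abs(anchor_rho) < MIN_ANCHOR_RHO:
        return None
    selected = [c for c, s in zip(changes, satisfaction) if s in anchor_levels]
    if not selected:
        raise ValueError("no patients gave an anchor rating of 2 or 4")
    return mean(selected)

# Hypothetical data for six patients.
baseline = [20, 25, 18, 30, 22, 27]
changes = [15, 10, 3, 12, -2, 8]     # follow-up minus baseline
satisfaction = [1, 2, 4, 2, 5, 3]    # 1 = very satisfied ... 5 = very dissatisfied

print(round(mid_distribution(baseline), 2))
print(mid_anchor(changes, satisfaction, anchor_rho=0.4))
```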
