Chinese cross-culturally adapted patient-reported outcome measures (PROMs) for knee disorders: a systematic review and assessment using the Evaluating the Measurement of Patient-Reported Outcomes (EMPRO) instrument

Identification of knee studies

A search was performed from the earliest available records up to 22 August 2020, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [9] (see PRISMA checklist). The following databases were searched: PubMed/MEDLINE, EMBASE (OVID), CINAHL (EBSCO), and CNKI (Chinese database). The search strings were built from MeSH terms and keyword combinations based on previously described and documented strategies for PROM searches [3, 10, 11] (see Additional file 1), and were then tailored to the knee anatomical region and the Chinese population. Articles published in English or Chinese were eligible.

Screening of articles and instruments

Screening was carried out as a three-step process (titles, abstracts, and full texts) performed independently by two reviewers. Outputs were compared and a consensus was reached. After full-text screening and identification of suitable articles, we manually reviewed the in-article reference lists for potentially relevant articles missed during the electronic search.

Based on Population, Intervention, Comparison, Outcome (PICO) criteria, the following inclusion criteria were adopted: (1) cross-culturally adapted and translated knee PROMs tested in the Mainland Chinese population; (2) knee-specific PROMs evaluating interventions for knee disorders; and (3) PROMs restricted to Mainland China and written in simplified Chinese.

Exclusion criteria were (1) PROMs written in traditional Chinese; (2) PROMs tested on populations outside Mainland China; and (3) articles otherwise not meeting the inclusion criteria.

Evaluating the Measurement of Patient-Reported Outcomes (EMPRO)

The EMPRO instrument consists of 39 items grouped into eight attributes, designed for quality assessment of PROMs: Conceptual and measurement model (items 1–7), Cultural and language adaptations of the instrument (items 8–10), Reliability (items 11–18), Validity (items 19–24), Responsiveness (items 25–27), Interpretability (items 28–30), Burden (items 31–37), and Alternative modes of administration (items 38–39) [5].

Quantitative assessment of each item uses a 4-point Likert scale graded from 4 (strongly agree) to 1 (strongly disagree); alternative response options are ‘No information’ and ‘Not applicable.’ A short free-comment box allows appraisers to document the rationale for each item grade. The appraiser is also required to provide an overall recommendation for use of the PROM on the following response scale: ‘Strongly recommended,’ ‘Recommended with provisos or alterations,’ ‘Would not recommend,’ and ‘Not enough information.’

The EMPRO tool is free to use but requires a license application via the portal www.bibliopro.org; it is presently available in two languages (English and Spanish).

Standardized and systematic evaluation

Following the systematic review, the specific instruments under investigation were identified from the full-text articles. For PROMs of non-Chinese origin, the original development publication was also retrieved (see Additional file 2). A standardized assessment of the adequacy of their measurement properties was undertaken using the EMPRO tool. In line with the designers’ recommendations, two reviewers (both clinicians with a background in PROMs research) performed the assessment; both had completed the online EMPRO training webinar (https://www.isoqol.org/category/webinar/page/3/). The assessment was carried out in two phases. In the first phase, each reviewer independently scored the article(s) supporting each cross-culturally adapted knee PROM for the methodological attributes, as well as the article describing the original design of the PROM for the conceptual and measurement model assessment. The second phase, which followed a consensus method recommended by the EMPRO designers, involved discussion of discrepancies between the reviewers to reach a common score for each item [5]. The reviewers were based on two different continents and did not discuss scoring until the consensus phase.

After the first phase of independent scoring, agreement between reviewers was assessed using a two-way random-effects, single-measure, absolute-agreement intraclass correlation coefficient (ICC) [12]. The degree of reviewer agreement was categorized according to Cicchetti (1994): ICC < 0.40, poor; 0.40–0.59, moderate; 0.60–0.74, good; and 0.75–1.00, excellent [13].
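For illustration, the ICC variant described above — ICC(2,1) in the Shrout–Fleiss taxonomy — can be computed from the mean squares of a two-way ANOVA decomposition of an items × raters score matrix. The sketch below is not the authors’ code (SPSS was used in the study); it is a minimal stdlib implementation of the standard formula:

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random-effects, absolute-agreement, single-measure.

    ratings: list of rows, one per rated item, each a list of k rater scores
    (complete data, no missing values).
    """
    n, k = len(ratings), len(ratings[0])
    flat = [v for row in ratings for v in row]
    grand = sum(flat) / (n * k)

    row_means = [sum(row) / k for row in ratings]                    # items
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]  # raters

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_err = sum((v - grand) ** 2 for v in flat) - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Shrout & Fleiss formula for ICC(2,1).
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

On the classic Shrout and Fleiss (1979) example data (6 targets rated by 4 judges), this yields ICC(2,1) ≈ 0.29, matching the published value.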

Scoring and analysis

Methodological attribute scores were calculated according to the developers’ instructions (https://www.isoqol.org/?s=Empro). Specifically, an attribute-specific score is obtained as the mean response of the applicable items, provided at least 50% of them are rated; items marked ‘No information’ are assigned a score of 1, the lowest possible item score. The response mean for each attribute is then linearly transformed to a 0–100 range (worst to best). A global score (based on metric properties) is calculated only when at least three attributes can be scored; attributes without information are assigned a score of zero. The panoramic assessment (which covers all culture/language versions of an instrument) includes the conceptual and measurement model, reliability, validity, responsiveness, and interpretability attributes, while the culture/language-specific evaluation additionally includes cross-cultural adaptation. The EMPRO domains are elaborate and strictly designed to avoid ceiling effects, making the maximum score of 100 difficult to obtain; a score of 50 (half the maximum) is therefore considered an acceptable threshold [5, 14]. We applied this minimum threshold in our analysis of the results.
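As a worked illustration of these scoring rules, the sketch below implements the attribute-score calculation as described above: ‘Not applicable’ items are excluded, ‘No information’ counts as the lowest score (1), and the item mean on the 1–4 scale is linearly rescaled to 0–100. The `None` sentinel for an item left unrated is our own convention for the sketch, not part of EMPRO:

```python
NO_INFO = "no information"   # rated, but assigned the lowest item score (1)
NA = "not applicable"        # excluded from the attribute entirely

def attribute_score(responses):
    """Score one EMPRO attribute on the 0-100 (worst-to-best) scale.

    responses: one entry per item -- an integer 1-4 (Likert grade),
    NO_INFO, NA, or None for an item left unrated.
    Returns None when fewer than 50% of applicable items are rated.
    """
    applicable = [r for r in responses if r != NA]
    rated = [r for r in applicable if r is not None]
    if not applicable or len(rated) < 0.5 * len(applicable):
        return None
    scores = [1 if r == NO_INFO else r for r in rated]
    item_mean = sum(scores) / len(scores)   # lies in [1, 4]
    return (item_mean - 1) / 3 * 100        # linear transform to 0-100
```

For example, an attribute with responses `[4, NO_INFO]` has an item mean of 2.5 and scores exactly 50.0 — the acceptability threshold described above.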

Analyses and graphics were produced with Microsoft Excel 2003 (Microsoft, Redmond, WA, USA). Differences in scores between EMPRO attributes were compared using the nonparametric Mann–Whitney test. Inter-rater reliability analysis was performed with SPSS® Version 20.0 (SPSS Inc., Chicago, IL, USA).
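For reference, the U statistic underlying the Mann–Whitney test can be sketched in a few lines of stdlib Python (midranks for ties); this is purely illustrative of the statistic, not the software used in the study, and a real analysis would rely on a statistical package:

```python
def mann_whitney_u(x, y):
    """Mann-Whitney U statistic for two independent samples.

    Ranks the pooled data (midranks for tied values), sums the ranks of
    the first sample, and returns the smaller of U1 and U2.
    """
    pooled = sorted(list(x) + list(y))

    def midrank(v):
        first = pooled.index(v)          # 0-based first position of v
        last = first + pooled.count(v)   # one past its last position
        return (first + 1 + last) / 2    # mean of the 1-based ranks

    r1 = sum(midrank(v) for v in x)
    u1 = r1 - len(x) * (len(x) + 1) / 2  # rank sum minus its minimum
    return min(u1, len(x) * len(y) - u1)
```

Completely separated samples give U = 0, while maximally overlapping samples give U = n1·n2/2.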
