Quality appraisal of clinical practice guidelines addressing massage interventions using the AGREE II instrument

Literature search and selection

The searches retrieved 5389 hits. We excluded 1065 duplicates and a further 4264 records after screening titles and abstracts, leaving 60 full-text articles to be assessed for eligibility. Of these 60 full texts, we excluded 11 for the following reasons: 1 was a systematic review of CPGs related to massage, 1 was in neither English nor Chinese, 1 was an abstract, 1 was produced by a consensus process, 4 were original versions, and 3 were not available as full text. The remaining 49 articles were included. The process for selecting the articles is presented in Fig. 1. The ggplot2 package in RStudio (v 2022.12.0-353) was used for the raincloud plots, and the bubble plot depicting the assessment results of guidelines and CPGs concerning different disease types was produced with Bioinformatics (https://www.bioinformatics.com.cn/) [27].
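The selection counts reported above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the numbers are copied from the paragraph, while the variable names and the dictionary labels are our own shorthand rather than anything defined by the authors.

```python
# Counts taken from the selection paragraph; names are illustrative only.
retrieved = 5389
duplicates = 1065
excluded_on_title_abstract = 4264

# Records left for full-text assessment.
full_texts = retrieved - duplicates - excluded_on_title_abstract

# Full-text exclusions, keyed by a shorthand for the stated reason.
full_text_exclusions = {
    "systematic review of CPGs": 1,
    "not in English or Chinese": 1,
    "abstract": 1,
    "consensus process": 1,
    "original version": 4,
    "full text unavailable": 3,
}

# Articles remaining after full-text exclusions.
included = full_texts - sum(full_text_exclusions.values())

print(full_texts, included)
```

Run as written, the two values agree with the 60 full texts and 49 included articles stated in the text.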

Fig. 1 Flow chart of the selection process

Characteristics of included CPGs and consensus

Forty-nine articles were included; 36 (73.5%) of them were CPGs and 13 (26.5%) were expert consensus. The included guidelines/consensus were mainly developed by organizations or societies, a majority of which were located in America (25, 51.0%). The CPGs were published between 2006 and 2022, and 20 CPGs (11.54%) were updated versions. Among these, 18 (36.7%) guidelines used Grading of Recommendations Assessment, Development, and Evaluation (GRADE) to assess the certainty of the evidence, and the other 18 (36.7%) used GRADE to assess the strength of the recommendation. The characteristics of the eligible CPGs and consensus are presented in Table 1 and Supplementary Material Appendix 2.

Table 1 General characteristics of included CPGs and consensus

AGREE II quality scores

The ICC analysis showed very good agreement among the three reviewers [ICC = 0.993, 95% CI (0.992, 0.995)].

Evaluation results for the overall quality of 36 CPGs showed that 4 (11%) were “good quality”, 15 (42%) were “sufficient quality” and 17 (47%) were “lower quality”. The AGREE II quality scores of domains ranged from 0.30 to 0.75 (see Fig. 2). The domain with the highest score across the guidelines was scope and purpose, with a median of 0.75 (0.52~0.91). The stakeholder involvement domain [median: 0.39 (0.31~0.56)] and application domain [median: 0.30 (0.17~0.47)] obtained lower scores. Each domain presented different results in various guidelines (see Table 2). AGREE II scores of each eligible CPG/consensus are presented in Supplementary Material Appendix 3.
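For readers unfamiliar with how a domain score on the 0 to 1 scale arises, the following minimal sketch applies the standard AGREE II scaling, in which each item is rated from 1 to 7 by every appraiser and the domain total is rescaled between its minimum and maximum possible values. It assumes that this standard scaling is what underlies the scores reported here; the function name and the example ratings are hypothetical and are not data from the study.

```python
def scaled_domain_score(ratings):
    """Return the AGREE II scaled domain score as a fraction in [0, 1].

    ratings: one list per appraiser, each holding that appraiser's
    1-7 scores for every item in the domain.
    """
    n_appraisers = len(ratings)
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every appraiser must rate every item")
    if any(not 1 <= s <= 7 for r in ratings for s in r):
        raise ValueError("item scores must lie between 1 and 7")

    obtained = sum(sum(r) for r in ratings)
    min_possible = 1 * n_items * n_appraisers
    max_possible = 7 * n_items * n_appraisers
    return (obtained - min_possible) / (max_possible - min_possible)


# Hypothetical ratings: three appraisers scoring a three-item domain.
example = [[6, 5, 6], [5, 5, 6], [6, 6, 5]]
score = scaled_domain_score(example)
print(round(score, 2))
```

With three appraisers and three items, the minimum possible total is 9 and the maximum is 63, so the hypothetical total of 50 rescales to 41/54.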

Fig. 2 AGREE II assessment by domain of 36 guidelines. The raincloud plot with mean score ± 95% CI depicts the distribution of the AGREE II scores of the guidelines in each domain. Each dot represents the standardized score of one guideline, combining the assessments of the three researchers

Table 2 AGREE II assessment scores of six domains of eligible CPGs

Scope and purpose

The average score of the six included guidelines in the scope and purpose domain was 0.73, 95% CI (52.0~92.5), ranging from 0.26 to 1.00 [28,29,30,31,32,33]. Most eligible guidelines comprehensively described the overall purpose, health questions and target populations. The exceptions were 5 guidelines [13, 34,35,36,37] that did not describe the health intents, expected benefit/outcome, or target population; 3 guidelines [36,37,38] that did not provide a detailed description of the PICO questions, i.e., population, intervention or exposure, comparison and outcomes; and 1 guideline [39] that did not explicitly describe the details of the target population.

Stakeholder involvement

The overall score in this domain was low; the average score was 0.44, 95% CI (31.0~56.8). All CPGs reported comprehensive member information of the guideline development group. Ten CPGs [11, 12, 33, 34, 36, 38, 40,41,42,43] did not mention patients’ views and preferences, while the target users were not clearly defined in nine CPGs [34, 36, 39,40,41, 44,45,46,47].

Rigor of development

The mean score for this domain was 0.55, 95% CI (44.3~66.5). Twenty-four guidelines scored above 50%, and three guidelines scored below 25%. Five guidelines [35,36,37,38, 48] did not provide detailed search strategies. The inclusion/exclusion criteria were explicitly described in eight guidelines [12, 32, 40, 41, 49,50,51,52]. Most guidelines clearly described the strengths or limitations of the body of evidence and health benefits, harms or risks of side effects. Four CPGs [37,38,39, 48] did not report criteria for rating evidence, and seven CPGs [11, 34, 37, 41, 48, 49, 53] did not address the methods for formulating the recommendations. Two CPGs [38, 42] did not mention benefits, harms or the balance between them. Only 1 guideline [31] provided comprehensive updated information.

Clarity of presentation

In the clarity of presentation domain, the mean score was 0.55, 95% CI (44.0~66.5). We found that two CPGs [32, 42] lacked specific and unambiguous recommendations. In five CPGs [13, 14, 34, 42, 54], multiple options with detailed population or clinical situation descriptions were provided for each targeted question. Key recommendations were presented unclearly in three CPGs [32, 42, 49], i.e., they could not be readily identified in the text.

Applicability

The score for the application domain was 31.9% ± 21.2%, 95% CI (17.0~48.3). The potential resource implications, details of the described facilitators, or barriers to application were not clearly defined in most CPGs, except for six CPGs [12, 13, 32, 33, 36, 47] that identified the types of facilitators and barriers. Facilitators included a wide variety of locations for therapy implementation [12, 13], supportive policy from the local government, standardized training procedures provided to the practitioners [13], etc. Several barriers that might impact guideline implementation were also mentioned in those CPGs, such as lack of availability in community hospitals [33], loss of skill over time from disuse, inadequate office space [32], etc.

Four guidelines [14, 28, 33, 55] mentioned information regarding the facilitators and barriers to implementing recommendations, and five guidelines [31, 32, 39, 55, 56] provided advice and tools on how the recommendations could be put into practice. In addition, only three guidelines [56,57,58] fully considered the potential resource implications of applying the recommendations, and two guidelines [51, 57] completely presented performance monitoring indicators and auditing criteria, including advice on the frequency and interval of measurement descriptions and operational definitions of how the criteria should be measured.

Editorial independence

This domain obtained a mean score of 56.1% ± 28.2%, 95% CI (33.0~83.0). Four CPGs [12, 45, 46, 59] did not state that the views of the funding body had not influenced the content of the guidelines, 4 CPGs [11, 35, 41, 49] did not present the conflicts of interest of the guideline development group members, while 1 CPG failed to declare both [37].

The overall assessment ratings for the 13 consensuses evaluated ranged from 0.06 to 0.46. All consensuses were classified as “lower quality”. For the 13 consensuses, the average scores of AGREE II domains 1–6 were 33.2%, 18.0%, 19.4%, 9.83%, 18.0% and 26.9%, respectively (see Table 3), indicating that each domain needs to be improved. The consensuses lack recommendations, which leads to a low rating in the ‘Clarity of presentation’ domain. Compared with consensuses, the development of CPGs is more rigorous, structured and reliably organized.

Table 3 AGREE II evaluation results of guidelines and consensuses

Level of evidence and strength of recommendation

Thirty-four CPGs (83.33%) used 10 types of grading systems to rate the level of evidence and the strength of recommendation (see Table 4). The GRADE system, which is more widely accepted, was adopted in the development of 16 CPGs. The grading system of evidence and recommendation was not reported in 2 guidelines [39, 48].

Table 4 Grading system of evidence and strength of recommendation

Recommendations for massage interventions

We included 11 massage-specific CPGs/consensuses and 38 disease-based CPGs/consensuses with recommendations on massage.

General view of the recommendations

A total of 119 massage-related recommendations were extracted from 36 guidelines. These included recommendations “in favor” (102, 85.7%), which meant that massage was recommended for use; for instance, in the CPGs that applied GRADE [13, 14], “in favor” was divided into “strong” or “weak” levels. Recommendations “against” (9, 7.6%) meant that massage was not recommended for use, and were likewise divided into “strong” or “weak” based on GRADE according to CPGs [
