Evaluation of the use of GRADE in dentistry systematic reviews and its impact on conclusions: a protocol for a methodological study

We will conduct two studies to assess the frequency (objective 1) and the implications (objective 2) of the use of GRADE in the current dental literature. We adhered to all sections of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses Protocols (PRISMA-P) statement that applied to our methodological study (see Additional file 1) [11].

Search strategy

We will utilize one search strategy to retrieve potentially eligible SRs for both studies.

We will perform a search in Ovid MEDLINE from January 1, 2016, to the present day. We will use search filters from the Health Information Research Unit (HIRU) of McMaster University as well as the Medical Subject Heading (MeSH) “dentistry” to search for SRs [12]. There will be no language restrictions in our search strategy. Our final search strategy (Table 1) will be reviewed by a methods expert (R.B.-P.).

Table 1 Ovid MEDLINE search

Screening process

For both studies, we will screen the titles/abstracts and full texts of the retrieved citations independently and in duplicate using Covidence. Conflicts will be resolved through discussion or by a third reviewer when necessary. Eligibility criteria will be different for each study and are described below.

Study sample and random sampling of citations

We will screen all citations retrieved from our search in MEDLINE at the title and abstract screening stage. We will then assign a random number to each title and abstract meeting the inclusion criteria using Microsoft Excel. We will complete full-text screening according to the random number assigned to each study, starting from the lowest number and working in ascending order until the number of included full texts meets our target sample size.
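The random ordering described above can be sketched in a few lines. This is only an illustration under stated assumptions: the protocol assigns the random numbers in Microsoft Excel, whereas the sketch uses Python, and the function names, the fixed seed, and the `is_eligible` callback (standing in for the duplicate full-text screening decision) are all hypothetical.

```python
import random


def screening_order(citation_ids, seed=2016):
    """Assign a random number to each citation and return the IDs
    sorted in ascending order of that number.

    The seed is an arbitrary, hypothetical choice that makes the
    ordering reproducible; the protocol does not specify one.
    """
    rng = random.Random(seed)
    keyed = [(rng.random(), cid) for cid in citation_ids]
    keyed.sort()
    return [cid for _, cid in keyed]


def screen_until_target(ordered_ids, is_eligible, target):
    """Screen full texts in the given order and stop once the number
    of included SRs reaches the target sample size."""
    included = []
    for cid in ordered_ids:
        if len(included) >= target:
            break
        if is_eligible(cid):  # placeholder for the reviewers' decision
            included.append(cid)
    return included
```

Fixing the seed means the screening order can be regenerated and audited later, which a spreadsheet that recalculates its random numbers does not guarantee by default.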

To obtain an informative sample, we aim to include a minimum of 50 SRs that use GRADE. Because a previous study found that nearly 30% of oral health SRs used GRADE, we used a more conservative estimate of 25% and determined that our target sample size should be 200 SRs to allow for at least 50 SRs using GRADE [9].
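The arithmetic behind the target can be checked with a short sketch. The function name is hypothetical, and the result should be read as the sample size at which the expected number of SRs using GRADE reaches 50 under the assumed proportion, not as a guarantee for any particular sample.

```python
import math


def target_sample_size(min_grade_srs, assumed_grade_proportion):
    """Smallest total sample whose expected number of SRs using GRADE
    reaches the required minimum."""
    # round() guards against floating-point noise before the ceiling
    return math.ceil(round(min_grade_srs / assumed_grade_proportion, 9))


print(target_sample_size(50, 0.25))  # 200 (conservative estimate used here)
print(target_sample_size(50, 0.30))  # 167 (estimate from the previous study)
```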

Study 1: Assessment of the frequency of the utilization of the GRADE approach in the recent dental literature

The following are the objectives:

A. To determine the frequency of the utilization of GRADE in dentistry SRs

B. To summarize the frequency of the levels of certainty determined by GRADE assessments conducted in dentistry SRs

C. To assess whether GRADE is being used appropriately at both the review and outcome level (for the primary outcome) in dentistry SRs

D. To evaluate whether SRs using GRADE differ from those that do not use GRADE with regard to methodological quality

Eligibility criteria

Inclusion criteria

We will include SRs of interventions in dentistry, published in English, which included only RCTs.

We will consider SRs to be studies in which either of the following two criteria is met:

1. The authors refer to the study as either a SR or meta-analysis and search at least one electronic database for published studies.

2. The authors search at least one electronic database for published studies and use well-defined eligibility criteria. We will consider eligibility criteria to be well-defined if they address all of the following:

(a) The study designs to be included in the SR

(b) The population of interest for the research question (e.g., patient characteristics, specific indication for treatment)

(c) The intervention(s)/comparator(s) the authors aim to investigate

In order to be considered a SR in dentistry, one of the following conditions must be met:

The SR includes studies in which patients receive treatment for an oral pathology or undergo an oral health-related procedure.

The SR includes studies in which one oral health-related intervention is compared to another, placebo, or standard care.

Exclusion criteria

SRs which conduct network meta-analyses (NMAs)

SRs which find no evidence and therefore fail to include any studies

SRs which are published in combination with another type of study (e.g., case study/series, health technology assessment, clinical practice guidelines)

Data extraction

Pairs of reviewers will extract data from eligible studies independently and in duplicate using forms created in Microsoft Excel. Reviewers will undergo a data extraction calibration exercise of three SRs per reviewer and pilot the standardized extraction sheet prior to the start of extraction. We will resolve conflicts through discussion or by consulting a third reviewer.

Data to be extracted from each SR will include general characteristics including title, author(s), journal, year of publication, country of authors, and, if applicable, dentistry specialty or specialties [13]. For SRs conducting GRADE assessments, reviewers will also identify the primary outcome of each SR, which is the outcome defined as such by the authors or the outcome first listed in the methods section. If there are multiple primary outcomes defined by the SR authors, we will use the first outcome mentioned. If the methods section does not clearly describe the outcomes, we will consider the first outcome mentioned in the results section to be the primary outcome. Additionally, if a review assesses multiple comparisons for the primary outcome, we will only consider the results of the first comparison described in the results. If the primary outcome is assessed at multiple time points, we will consider only the results of the shortest time point. We will also extract data on the methodology of each SR, including the methods of searching, screening, and data extraction as well as the results of the SRs, including the outcomes analyzed and the number of included RCTs. In order to allow us to select an outcome of interest for study 2, we will also extract data on whether the SR authors conducted and reported the results of an RoB assessment and whether they report the number of participants analyzed for narratively reported outcomes.
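The fallback rule for identifying the primary outcome can be written as a small decision function. This is a minimal sketch, assuming the extractor has already listed the outcomes in the order in which they appear in each section; the function and parameter names are hypothetical and are not part of the extraction form.

```python
def select_primary_outcome(declared_primary, methods_outcomes, results_outcomes):
    """Return the outcome treated as primary for one SR.

    Each argument is a list of outcome names in order of appearance:
    - declared_primary: outcomes the SR authors label as primary
    - methods_outcomes: outcomes listed in the methods section
    - results_outcomes: outcomes reported in the results section
    """
    if declared_primary:
        # Several declared primary outcomes: use the first one mentioned.
        return declared_primary[0]
    if methods_outcomes:
        # No declared primary outcome: use the first listed in the methods.
        return methods_outcomes[0]
    if results_outcomes:
        # Methods do not clearly describe outcomes: fall back to the results.
        return results_outcomes[0]
    return None  # no outcome could be identified
```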

Regarding GRADE, we will extract the extent to which it was used in each SR (for all, some, or none of its outcomes), whether summary of findings tables were used, whether GRADE was used for all outcomes that were meta-analyzed, and whether GRADE was used for outcomes that were not meta-analyzed. We will also determine whether the SR authors refrain from making recommendations or statements about whether an intervention should or should not be used in clinical practice. We will search for potential recommendations in the conclusion, discussion, and abstract of the SR. We will extract additional data on the GRADE assessments for the primary outcome including the final certainty of the evidence rating, ratings and explanations for each GRADE domain, and additional information to allow us to determine whether GRADE was used appropriately at the outcome level. We will note any other issues with the GRADE assessments of the SR authors as part of our evaluation of whether GRADE was used appropriately.

We will also extract whether GRADE assessments were incorporated into conclusions about the primary outcome in the abstract and body of the SR. We will define a conclusion as a statement in which the authors interpret their results by stating whether the intervention(s) has beneficial or harmful effects relative to, or is no different from, the comparator(s) or stating that there is a lack of evidence regarding the outcome. We will first extract the conclusions about the primary outcome from the abstract. If there is no conclusion section in the abstract, we will extract any conclusion statements from the results of the abstract. We will also extract conclusions about the primary outcome from the body of the SR, referring to the SR's designated conclusion section to minimize subjective judgments. If there is no conclusion section, we will extract the conclusion from the discussion section. Finally, if there is no clear conclusion statement in any of the aforementioned sections, we will not assess the conclusions of the SR but will still incorporate the SR in our other analyses (e.g., percentage of use of GRADE).

A summary of the data extraction fields can be found in Table 2. Should further data necessitate extraction, we will modify the standardized form, extract this new data for all eligible studies, and report these protocol modifications in the final publication.

Table 2 Data extraction fields

Data analysis

All retrieved articles will be presented in a study selection flow chart and the data of eligible studies summarized in tables.

For determining how frequently GRADE is used in dentistry SRs (objective A), we will first conduct a descriptive analysis. We will calculate the percentage of SRs using GRADE overall, by year, and by dental specialty from our entire sample of studies. Additionally, for SRs using GRADE for at least one outcome, we will calculate the percentage of SRs using GRADE for outcomes in which no meta-analysis was conducted [14]. Finally, we will determine how frequently the authors incorporate GRADE into the conclusions of the SRs’ primary outcome in both the body of the SR and the abstract.
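The descriptive percentages for objective A could be computed along the following lines. This is a sketch only: the record fields (`year`, `specialty`, `uses_grade`) are hypothetical stand-ins for the extraction fields in Table 2, and the four records are invented purely to show the calculation.

```python
from collections import defaultdict


def percent_using_grade(records, key=None):
    """Percentage of SRs using GRADE, overall or grouped by `key`
    (e.g., "year" or "specialty")."""
    totals = defaultdict(int)
    users = defaultdict(int)
    for record in records:
        group = record[key] if key else "overall"
        totals[group] += 1
        if record["uses_grade"]:
            users[group] += 1
    return {group: 100.0 * users[group] / totals[group] for group in totals}


# Invented example records, not study data.
records = [
    {"year": 2016, "specialty": "periodontics", "uses_grade": True},
    {"year": 2016, "specialty": "endodontics", "uses_grade": False},
    {"year": 2017, "specialty": "periodontics", "uses_grade": True},
    {"year": 2017, "specialty": "endodontics", "uses_grade": True},
]

print(percent_using_grade(records))          # {'overall': 75.0}
print(percent_using_grade(records, "year"))  # {2016: 50.0, 2017: 100.0}
```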

For summarizing the frequency of the levels of certainty determined by GRADE assessments in dentistry SRs (objective B), we will determine the percentage of high, moderate, low, and very low certainty evidence among the primary outcomes of each SR and stratified by dental specialty. To evaluate which limitations are more likely to lead to lower certainty evidence in the current literature, we will also quantify the frequency of concerns that lead to rating down the certainty of the evidence for each GRADE domain.

For assessing whether GRADE is being used appropriately (objective C), we will conduct two separate evaluations: at the review level and at the outcome level for the primary outcome of each SR [6, 15,16,17,18,19,20]. We will determine the percentage of SRs using GRADE appropriately at each level using the criteria outlined in Table 3 [6, 15,16,17,18,19,20].

Table 3 Checklist for determining whether GRADE was used appropriately

We will evaluate whether SRs using GRADE differ from those that do not use GRADE with regard to methodological quality (objective D). To evaluate the methodological quality of each SR, we will refer to two aspects of the ROBIS tool [21]. First, we will determine whether the search strategy was comprehensive. A search strategy will be considered comprehensive if it searches for published and unpublished reports (by specifying grey literature databases or searching for unpublished reports through any other means) (ROBIS question 2.1) [21]. Second, we will assess whether efforts were made to minimize errors during screening (i.e., title/abstract and/or full-text screening) as well as data extraction (ROBIS questions 2.5, 3.1) [21]. As we anticipate poor reporting of the methods used for screening, we will consider any mention of conducting screening independently and in duplicate or by having a second reviewer check the work of another to be minimizing errors. For data extraction, as described in the ROBIS tool, SRs for which this process is conducted independently and in duplicate or by having a second reviewer check the work of another reviewer in detail will be considered to be minimizing errors [21]. We have chosen to utilize these two domains of ROBIS to maximize objectivity as the remaining domains require more judgment contextualized to each SR's clinical question. We will use the odds ratio and its 95% confidence interval (CI) to determine whether SRs using GRADE are more likely to (1) have a comprehensive search strategy that considers grey literature and (2) take steps to avoid errors in screening and data extraction.
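An odds ratio and its 95% CI can be obtained from a 2×2 table as sketched below. The protocol does not state which interval method will be used; the sketch assumes the common log-based (Woolf) interval and a 0.5 continuity correction when a cell is empty, and the function name and example counts are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a log-based (Woolf) confidence interval.

    a: SRs using GRADE with the feature (e.g., comprehensive search)
    b: SRs using GRADE without the feature
    c: SRs not using GRADE with the feature
    d: SRs not using GRADE without the feature
    z: normal quantile (1.96 gives an approximate 95% CI)
    """
    if min(a, b, c, d) == 0:
        # Assumed continuity correction so the ratio and SE are defined.
        a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts, not study data.
or_, lower, upper = odds_ratio_ci(30, 20, 60, 90)
print(f"OR = {or_:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 would suggest that the feature is associated with the use of GRADE; with sparse cells, an exact method may be preferable to this approximation.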

Study 2: Impact of the presence of GRADE assessments on the conclusions of dentistry-related systematic reviews

The following are the objectives:

A. To determine whether a lack of certainty of the evidence assessments is a predictor of inappropriately formulated conclusions in SRs

B. To determine whether the use of GRADE changes the conclusions of dentistry SRs which do not utilize the tool

Outcome of interest

To conduct this study, we will focus on a specific outcome across all SRs. We will determine the outcome of interest based on the following criteria:

1. The outcome of interest will be the outcome most frequently reported within our sample of SRs and for which the following information is also available:

(a) The findings of the RoB assessment conducted by SR authors

(b) The effect estimate with its 95% CI or number of participants analyzed

This outcome will be selected upon completion of data extraction for study 1, which will allow us to map the outcomes frequently investigated in the sample. The outcome must meet the aforementioned requirements because they are needed to conduct the GRADE assessments for study 2. Given that oral health SRs have been found to be most frequently downgraded in the study limitations and imprecision domains [9], the aforementioned criteria are the minimum that our review team will require to conduct GRADE assessments. We chose to focus on the single most frequently investigated outcome to make it feasible for our review team to conduct GRADE assessments.

Eligibility criteria

Inclusion criteria

SRs eligible for this study must meet all of the eligibility criteria outlined above for study 1, in addition to reporting on the outcome of interest.

Data extraction

All data will be extracted independently and in duplicate using a piloted data extraction form. Reviewers will begin extraction upon completion of a calibration exercise. For all eligible SRs, we will extract the SRs’ conclusion for the outcome of interest alongside additional data including whether the conclusions made by study authors relied on statistical significance, included recommendations, and considered if there were any limitations. For SRs not using GRADE, we will extract the minimum information needed for our team to make a GRADE assessment. For SRs where a meta-analysis was conducted for the outcome of interest, this will include the results of the meta-analysis. In cases where the outcome of interest is summarized without a meta-analysis, we will extract the list of RCTs analyzed for the outcome, the number of participants analyzed overall, the SRs’ narrative summary of the analysis, and any effect estimates provided for each of the individual RCTs. In the case where the outcome of interest was measured at multiple time points or investigated for multiple comparisons, we will only consider the results of the shortest time point and the first comparison listed in the results. We will also extract the results of the RoB assessment, identify the level of contextualization used to assess imprecision, and determine whether there was any evidence of publication bias or indirectness for the outcome of interest. The additional data extraction fields for this study can be found in Table 4.

Table 4 Additional data extraction fields for the outcome of interest in study 2

Data analysis

We will use a study selection flow chart to present the retrieved articles and tables to summarize the characteristics of eligible studies.

To determine whether a lack of certainty of the evidence assessments is a predictor of inappropriately formulated conclusions in SRs (objective A), we will first evaluate the conclusions made by all the SRs for the outcome of interest, irrespective of whether they use GRADE. Two reviewers will independently assess these conclusions to determine whether they are appropriately formulated, and conflicts will be resolved through discussion or by a third reviewer where needed. Once all conclusions have been classified as appropriately formulated or not, we will use the odds ratio and its 95% CI to evaluate whether SRs using GRADE are more likely to formulate appropriate conclusions compared to SRs not using GRADE.

A conclusion will be considered to be appropriately formulated if it meets all the following criteria:

The conclusion is justified by the results presented in the review.

The conclusion does not rely on statistical significance [21].

The conclusion considers if there are any limitations.

We will consider SR authors to have addressed limitations by stating whether or not the results are impacted by any number of factors (e.g., low quality of RCTs, heterogeneity, small sample size, publication bias, short follow-up time in RCTs) or referencing their GRADE certainty of the evidence rating.

The conclusion does not make recommendations [22].

To determine whether the use of GRADE changes the conclusions of dentistry SRs (objective B), reviewers will evaluate the conclusions of a subset of the study sample which does not utilize the GRADE approach and report on the outcome of interest. First, we will classify the authors’ conclusions in terms of their certainty as either definitive or recognizing uncertainty (addresses any limitations of the evidence by means of a GRADE assessment or through some other means). Examples of other ways to recognize uncertainty include stating that the results should be interpreted with caution due to high risk of bias, heterogeneity, small sample size, etc., stating that there were limitations in the evidence, stating that there is insufficient evidence to draw a conclusion, or stating that further high-quality studies are needed. We will also identify how the effect size is categorized in the SR conclusions according to the level of contextualization used by the authors of each SR (i.e., minimally contextualized or partially contextualized) [23]. A minimally contextualized approach will be defined as an approach that focuses on whether an important effect exists (i.e., negligible/trivial/no difference or important difference between interventions), while a partially contextualized approach defines the magnitude of the effect (i.e., negligible, small, moderate, or large) [23]. We will assume a minimally contextualized approach is used unless the authors explicitly state using a partially contextualized approach or if this can be inferred from their conclusions as they refer to different magnitudes of effect. If the SR authors rely on statistical significance to classify the effect size, we will consider this to be a minimally contextualized approach. Conclusions will be classified by two reviewers until a consensus is reached.

Second, after classifying the authors’ original conclusions, the review team will complete the GRADE assessments for the outcome of interest in these SRs independently and in duplicate. We will assess imprecision using the same level of contextualization used by the authors of each SR (i.e., minimally contextualized or partially contextualized) as classified by our team using the criteria above. We will use the information provided by the SR authors to assess the risk of bias and inconsistency. We will assume no concerns for indirectness and publication bias unless otherwise stated in the SRs’ results or discussion sections. Using the results of our GRADE assessment, we will then formulate one conclusion per SR as shown in Table 5, using the same level of contextualization utilized by the SR authors [24].

Table 5 Methods for formulating conclusions

After this, our team’s conclusion will be compared to the authors’ conclusions with respect to the certainty and effect size (Table 6). If the review authors’ original conclusion only states that there is insufficient evidence to comment on the outcome or does not comment on the effect size, we will only evaluate whether the conclusion changes with respect to certainty. We will calculate the percentage of conclusions which changed after a GRADE assessment was completed with respect to the classification of the certainty and/or effect size. We will also calculate the percentage of conclusions which have increased certainty, decreased certainty, increased effect size, or decreased effect size after a GRADE assessment is conducted. This will allow us to evaluate how the utilization of GRADE may impact authors’ conclusions by assessing whether conclusions change following GRADE assessments and by determining the direction of that change (i.e., do conclusions become more or less conservative following GRADE assessments?).
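The change percentages described above could be tallied as sketched here. This is an illustration under assumptions: the ordered scales and label wording are hypothetical simplifications and do not reproduce Table 6, and the example pairs are invented.

```python
def change_summary(pairs, scale):
    """Percentage of conclusions that changed, increased, or decreased.

    pairs: list of (authors_label, team_label) tuples for one dimension
    scale: labels ordered from lowest to highest (hypothetical ordering)
    """
    rank = {label: position for position, label in enumerate(scale)}
    n = len(pairs)
    increased = sum(1 for before, after in pairs if rank[after] > rank[before])
    decreased = sum(1 for before, after in pairs if rank[after] < rank[before])
    return {
        "changed": 100.0 * (increased + decreased) / n,
        "increased": 100.0 * increased / n,
        "decreased": 100.0 * decreased / n,
    }


# Assumed ordering: recognizing uncertainty < definitive.
certainty_scale = ["recognizing uncertainty", "definitive"]

# Invented pairs: (authors' original conclusion, conclusion after GRADE).
certainty_pairs = [
    ("definitive", "recognizing uncertainty"),
    ("definitive", "definitive"),
    ("recognizing uncertainty", "recognizing uncertainty"),
    ("definitive", "recognizing uncertainty"),
]

print(change_summary(certainty_pairs, certainty_scale))
# {'changed': 50.0, 'increased': 0.0, 'decreased': 50.0}
```

The same function applies to effect size by passing an ordered scale such as negligible, small, moderate, large, after dropping conclusions that do not comment on effect size.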

Table 6 Methods for determining the effect of the GRADE approach on SR conclusions
