Comparison of VTE risk scores in guidelines for VTE diagnosis in nonsurgical hospitalized patients with suspected VTE

The present study ranked the six scores by Youden index (YI) and C-index. In descending order of predictive power for VTE diagnosis, they were the Geneva, Wells, IMPROVE, Padua, YEARS, and PERC scores. No statistically significant difference in predictive power was found for the pairs Wells vs Geneva, YEARS vs Padua, YEARS vs IMPROVE, and Padua vs IMPROVE, whereas all other pairs differed significantly. In other words, the Geneva and Wells scores performed best, the PERC score performed worst, and the others were intermediate. Revised cutoffs improved the predictive power of the PERC, Padua, and IMPROVE scores. Of note, the absolute predictive performance of every isolated score was poor.
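For readers less familiar with the C-index used in this ranking, the following minimal sketch shows one common way to compute it for a single score: the proportion of (VTE, non-VTE) patient pairs in which the VTE patient has the higher score, with ties counted as one half. The function name and the score values are hypothetical illustrations, not the study's data or its statistical software.

```python
from itertools import product

def c_index(scores_with_vte, scores_without_vte):
    """Concordance: chance that a random VTE patient outscores a random
    non-VTE patient, counting tied scores as half-concordant."""
    pairs = list(product(scores_with_vte, scores_without_vte))
    concordant = sum(
        1.0 if pos > neg else 0.5 if pos == neg else 0.0
        for pos, neg in pairs
    )
    return concordant / len(pairs)

# Hypothetical score totals, NOT data from the study.
vte = [4, 3, 5, 2]
no_vte = [1, 2, 3, 0, 2]
print(c_index(vte, no_vte))  # 0.875
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of patients with and without VTE.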

The prevalence of VTE in the current cohort was 13.7%, which is broadly consistent with previous studies reporting overall VTE event rates of 10 to 15% in hospitalized medical patients [26]. Accordingly, the degree of VTE risk in this study population is representative of nonsurgical hospitalized patients. Most items in these scores were correlated with VTE occurrence in the current patient population, which again supports their inclusion in the scores. Comparisons involving more than two of these six scores have been rare to date. No directly comparable study is available for reference, although some analogous studies exist. A recent systematic review compared the ability of the Wells, Geneva, YEARS, and PERC scores to rule out PE across different healthcare settings. In the hospital setting, the Wells score plus PTP-adjusted D-dimer (sensitivity 95.64%, specificity 39.50%), the Geneva score plus PTP-adjusted D-dimer (sensitivity 95.73%, specificity 37.29%), and the YEARS score plus PTP-adjusted D-dimer (sensitivity 96.94%, specificity 35.83%) yielded similar diagnostic accuracy [27]. This is broadly consistent with the present results, except that the YEARS score was inferior to the other two in the current study. However, because that review incorporated PTP-adjusted D-dimer, notably for the YEARS score, and targeted only PE, it is not a fully appropriate reference.

Among these six scores, Wells versus Geneva and Padua versus IMPROVE have been compared most frequently. Studies comparing the Wells and Geneva scores have reported mixed results. Some found that the two scores had similar predictive accuracy in patients with suspected PE [28,29,30,31,32], whereas others found the Wells score more accurate than the Geneva score [33,34,35,36,37,38]. In the present study, the predictive power for VTE diagnosis was similar between the two, with the Geneva score appearing slightly better than the Wells score but without a statistically significant difference. For Padua versus IMPROVE, several previous studies suggested that the two scores were evenly matched in predictive power for VTE diagnosis [39,40,41]. The present results are consistent with those studies, although the IMPROVE score appeared slightly better than the Padua score without statistical significance.

The predictive power of a VTE risk score for VTE diagnosis depends on how strongly its predisposing factors and typical indications of VTE correlate with VTE occurrence: the stronger the correlation, the better the predictive power. According to Table 1, the sums of the presence frequencies of VTE risk elements, in descending order, are 30, 29, 28, 26, 26, and 11 for the Geneva, PERC, Wells, Padua, IMPROVE, and YEARS scores, respectively. The presence frequencies per element, in descending order, are 4.29, 4.00, 3.71, 3.67, 3.63, and 2.60 for the Geneva, Wells, IMPROVE, YEARS, PERC, and Padua scores, respectively. Both measures, especially the latter, reflect how relevant and widely accepted these risk elements are in VTE risk assessment. According to Fig. 1, the VTE risk elements that appear at least three times are recent immobilization, trauma or surgery (5 times), previous VTE history (5 times), DVT symptoms and/or signs (5 times), hemoptysis (4 times), active cancer (4 times), age (4 times), and heart rate or pulse (3 times).
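The per-element figures above follow from dividing each sum by the number of elements in the score, as the sketch below illustrates. The sums are those quoted from Table 1; the element counts are read from the element lists given later in this discussion, except the eight items assumed for the PERC score, which is an assumption based on the published rule rather than a figure stated here. Half-up decimal rounding is used because it reproduces the reported 3.63 for PERC (29/8 = 3.625).

```python
from decimal import Decimal, ROUND_HALF_UP

# score -> (sum of presence frequencies, number of elements)
# Element counts are assumptions as described in the lead-in.
scores = {
    "Geneva": (30, 7),
    "Wells": (28, 7),
    "IMPROVE": (26, 7),
    "YEARS": (11, 3),
    "PERC": (29, 8),
    "Padua": (26, 10),
}

def per_element(total, n_elements):
    """Presence frequency per element, rounded half-up to two decimals."""
    ratio = Decimal(total) / Decimal(n_elements)
    return ratio.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

values = {name: per_element(total, n) for name, (total, n) in scores.items()}
ranking = sorted(values, key=values.get, reverse=True)

for name in ranking:
    print(f"{name}: {values[name]}")
```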

The choice of cutoffs for risk classification also affects the predictive power of VTE scores. For most VTE scores, a higher cutoff yields higher specificity and lower sensitivity, and vice versa. The more appropriate the cutoff, the better the predictive power for VTE diagnosis. A balance must be struck between missed diagnoses and excessive examinations. Of note, patient populations with different clinical probabilities of VTE may require different cutoffs. The ratios of the VTE-likely cutoff to total points, in descending order, are 0.88, 0.33, 0.33, 0.29, 0.20, and 0.17 for the PERC, Geneva, YEARS, Wells, Padua, and IMPROVE RAMs, respectively. The PERC score is distinctive because all of its items are framed as negative risk factors for VTE occurrence, whereas the items of the other five scores are positive risk factors. Had its items been framed as positive risk factors, its ratio would have been 0.12 rather than 0.88, the lowest of all six scores.
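The trade-off between sensitivity and specificity can be made concrete with a small sketch that sweeps candidate cutoffs and picks the one maximizing the Youden index (sensitivity + specificity - 1). This is only one possible criterion for an "appropriate" cutoff and is not necessarily the procedure used in the study; the function names and the patient data are hypothetical.

```python
def sens_spec(scores, has_vte, cutoff):
    """Treat score >= cutoff as VTE-likely; return (sensitivity, specificity)."""
    tp = sum(s >= cutoff and y for s, y in zip(scores, has_vte))
    fn = sum(s < cutoff and y for s, y in zip(scores, has_vte))
    tn = sum(s < cutoff and not y for s, y in zip(scores, has_vte))
    fp = sum(s >= cutoff and not y for s, y in zip(scores, has_vte))
    return tp / (tp + fn), tn / (tn + fp)

def best_cutoff(scores, has_vte):
    """Cutoff among the observed score values with the highest Youden index."""
    def youden(cutoff):
        sensitivity, specificity = sens_spec(scores, has_vte, cutoff)
        return sensitivity + specificity - 1
    return max(sorted(set(scores)), key=youden)

# Hypothetical cohort, NOT data from the study.
scores = [0, 1, 1, 2, 2, 2, 3, 4, 5, 5]
has_vte = [False, False, False, False, False, True, True, True, True, True]

for cutoff in sorted(set(scores)):
    print(cutoff, sens_spec(scores, has_vte, cutoff))
print("best cutoff:", best_cutoff(scores, has_vte))
```

In this toy cohort, raising the cutoff never increases sensitivity and never decreases specificity, which mirrors the general pattern described above.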

Since the Wells and Geneva scores were introduced, their role in predicting the PTP of PE has been externally validated in a series of studies [2, 6, 9, 27, 37]. Among the six scores, the Geneva and Wells scores have the highest (30) and third highest (28) sums of presence frequencies of VTE risk elements, and the highest (4.29) and second highest (4.00) presence frequencies per element, respectively. Both scores contain recent immobilization, trauma or surgery (5 times), previous VTE history (5 times), DVT symptoms and/or signs (5 times), hemoptysis (4 times), active cancer (4 times), and heart rate or pulse (3 times); they differ only in that the Wells score includes alternative diagnosis less likely than PE (2 times), whereas the revised Geneva score includes age (4 times). In other words, the Geneva and Wells scores, especially the former, contain the most widely acknowledged risk elements for VTE diagnosis among all six scores. Universally accepted risk factors, which represent the predictors most strongly correlated with VTE occurrence, may help improve a score's predictive accuracy for VTE diagnosis. The two scores are also highly similar in composition: six of their seven elements (26 times) are identical. This may account for their similar predictive performance in VTE diagnosis. In addition, ROC analyses supported the appropriateness of their cutoffs. A caveat is nonetheless necessary: unlike the Geneva score, the Wells score incorporates a subjective criterion, “alternative diagnosis less likely than PE”, which depends on the clinician's experience and is difficult to standardize or teach.

The Padua and IMPROVE scores are two authoritative scores endorsed by leading guidelines for medical patients and have been extensively validated in external studies [17, 18, 21]. A closer look at their composition shows that they have the same sum of presence frequencies of VTE risk elements (26), whereas the presence frequency per element is higher for the IMPROVE score (3.71) than for the Padua score (2.60). Both scores contain previous VTE history (5 times), recent immobilization, trauma or surgery (5 times), age (4 times), active cancer (4 times), and thrombophilia (2 times). They differ in that the Padua score also incorporates ongoing hormonal therapy (2 times), acute infection and/or rheumatologic disorder (1 time), acute myocardial infarction and/or ischemic stroke (1 time), body mass index (1 time), and heart and/or respiratory failure (1 time), whereas the IMPROVE score incorporates DVT symptoms and signs (5 times) and ICU/CCU stay (1 time). Taken together, the majority of elements by presence frequency (20 of 26 times), all widely acknowledged risk factors for VTE occurrence, are shared by the Padua and IMPROVE scores. Their similar performance may be attributable to this structural similarity, although the IMPROVE score appeared slightly better than the Padua score without statistical significance.

Overall, the Geneva and Wells scores outperformed the IMPROVE and Padua scores in predictive power for VTE diagnosis. These four scores share only three VTE risk elements, namely previous VTE history (5 times), recent immobilization, trauma or surgery (5 times), and active cancer (4 times), and a large proportion of their elements are not in common. The Geneva and Wells scores both include modifiable risk factors such as hemoptysis and heart rate or pulse, which reflect the patient's current point-of-care status, whereas the IMPROVE and Padua scores do not. The lack of such elements may weaken their predictive power for VTE diagnosis. Of note, although all four RAMs reflect VTE risk, the IMPROVE and Padua scores were endorsed by guidelines on VTE prevention or thromboprophylaxis [17, 18], whereas the Geneva and Wells scores were endorsed by guidelines on the diagnosis and management of PE [2, 9, 10, 15]. The present results confirmed that the IMPROVE and Padua scores were inferior to the Geneva and Wells scores in predictive power for VTE diagnosis. Nonetheless, revised cutoffs could improve their performance to some degree.
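The structural overlap described in this and the two preceding paragraphs can be checked with simple set arithmetic. The sketch below encodes the element lists and presence frequencies quoted in the text; the short element keys are illustrative labels chosen here, not terminology from the scores themselves.

```python
# Presence frequency of each element, as quoted in the text.
freq = {
    "immobilization_trauma_surgery": 5, "previous_vte": 5, "dvt_signs": 5,
    "hemoptysis": 4, "active_cancer": 4, "age": 4, "heart_rate": 3,
    "alternative_dx_less_likely": 2, "thrombophilia": 2, "hormonal_therapy": 2,
    "infection_rheumatologic": 1, "mi_or_stroke": 1, "bmi": 1,
    "heart_resp_failure": 1, "icu_ccu_stay": 1,
}

core = {"immobilization_trauma_surgery", "previous_vte", "active_cancer"}
elements = {
    "Geneva": core | {"dvt_signs", "hemoptysis", "heart_rate", "age"},
    "Wells": core | {"dvt_signs", "hemoptysis", "heart_rate",
                     "alternative_dx_less_likely"},
    "Padua": core | {"age", "thrombophilia", "hormonal_therapy",
                     "infection_rheumatologic", "mi_or_stroke", "bmi",
                     "heart_resp_failure"},
    "IMPROVE": core | {"age", "thrombophilia", "dvt_signs", "icu_ccu_stay"},
}

def total_weight(score):
    """Sum of presence frequencies over all elements of one score."""
    return sum(freq[e] for e in elements[score])

def shared_weight(a, b):
    """Sum of presence frequencies over the elements two scores share."""
    return sum(freq[e] for e in elements[a] & elements[b])

shared_by_all = set.intersection(*elements.values())

print(shared_weight("Geneva", "Wells"))   # 26
print(shared_weight("Padua", "IMPROVE"))  # 20
print(sorted(shared_by_all))
```

The four-way intersection contains exactly the three elements named above, and the pairwise totals match the 26 and 20 reported for the Wells/Geneva and Padua/IMPROVE pairs.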

The YEARS score is a condensed derivative of the Wells score. The term YEARS algorithm generally denotes the YEARS score applied together with a D-dimer level rather than the isolated score [2, 24]. In the current study, YEARS was used as an isolated score rather than as an algorithm, because the aim was to compare isolated VTE risk scores without D-dimer. As such, the current results are not applicable to the YEARS algorithm. The YEARS score has only three elements: DVT symptoms and/or signs (5 times), hemoptysis (4 times), and alternative diagnosis less likely than PE (2 times). Its presence frequency per element is 3.67, lower only than those of the Geneva, Wells, and IMPROVE scores, even though the sum of its presence frequencies is only 11. In a retrospective study comparing the predictive accuracy for PE between the YEARS algorithm (RAM + D-dimer) and the Wells score, the YEARS algorithm was more sensitive than the Wells score (97.44% vs 74.36%) but less specific (13.97% vs 33.94%). The YEARS algorithm also yielded a better negative predictive value than the Wells score (98.0% vs 92.4%). However, that study used the YEARS algorithm rather than the isolated YEARS score [42], so it is not an ideal parallel to the current one. In the present study, the isolated YEARS score was outperformed by the Geneva and Wells scores, probably because of its excessively simple structure, although it performed similarly to the IMPROVE and Padua scores. Nevertheless, its cutoff was shown to be appropriate. Of note, the YEARS score also contains the subjective element “alternative diagnosis less likely than PE”.

The PERC score was originally developed to exclude PE in patients with a low clinical probability of PE and has been validated in a randomized controlled trial [43]. It has high sensitivity but low specificity for PE among patients with intermediate or high clinical probability of PE [2, 44]. Likewise, in the present study, whose subjects were hospitalized patients with a considerable probability of VTE, its predictive power for VTE diagnosis was the worst of the six scores, although its NPV, FNR, NLR, and DOR were still satisfactory. The sum of presence frequencies of VTE risk elements in the PERC score is 29, second only to that of the Geneva score (30), but its presence frequency per element is 3.63, the second lowest of all scores. More importantly, the original PERC cutoff, which was derived from a population with a low clinical probability of PE, led to poor predictive power in the current patient population. With the original cutoff, missed diagnoses were largely avoided, but at the cost of a substantial number of unnecessary imaging examinations; a revised cutoff could improve its performance.
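To illustrate how a rule can show poor overall discrimination while its NPV and FNR remain acceptable, the sketch below computes the standard measures from a 2x2 table using their usual textbook definitions. The counts are hypothetical and chosen only to mimic a highly sensitive, poorly specific rule; they are not the study's results, and the function name is illustrative.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 measures. Assumes every denominator is non-zero;
    the DOR in particular is undefined when fp * fn == 0."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "npv": tn / (tn + fn),                    # negative predictive value
        "fnr": fn / (tp + fn),                    # false negative rate
        "nlr": (1 - sensitivity) / specificity,   # negative likelihood ratio
        "dor": (tp * tn) / (fp * fn),             # diagnostic odds ratio
    }

# Hypothetical counts, NOT data from the study.
metrics = diagnostic_metrics(tp=95, fp=400, fn=5, tn=100)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these counts, few cases are missed (low FNR, high NPV), yet most patients without VTE are still flagged as positive, which corresponds to the excess of imaging examinations described above.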

Several limitations of the current study need to be acknowledged. First, the study was a retrospective review, so prospective studies are warranted. Second, because the subjects were nonsurgical hospitalized patients, the results may not be applicable to surgical patients or ambulatory outpatients. Moreover, although clinical VTE risk scores are generally meant to be applied to all nonsurgical hospitalized patients, only those with suspected VTE were included in this study; the results may therefore not be applicable to nonsurgical hospitalized patients in general. Third, the simplified rather than the original versions of the Wells and Geneva scores were used, and the results might have been different with the original versions. Likewise, the Wells DVT RAM [6, 16] was not included in the current study. Last but not least, D-dimer was not included, since the aim of the current study was to compare VTE risk scores. It is worth noting that the absolute performance of each isolated score was unsatisfactory (C-index < 0.7 for all), which is broadly consistent with previous studies [45]. Accordingly, guidelines currently strongly recommend combining risk scores with D-dimer [2]. The results might have been different if D-dimer had been included.

In conclusion, a comparison of the predictive power for VTE diagnosis of six guideline-endorsed VTE risk scores in nonsurgical hospitalized patients with suspected VTE indicates that the Geneva and Wells scores perform best, the PERC score performs worst, and the others are intermediate. There is little difference between the Geneva and Wells scores, or among the IMPROVE, Padua, and YEARS scores. Revised cutoffs improve the performance of the PERC, Padua, and IMPROVE scores. Nevertheless, the absolute performance of all isolated scores is mediocre. These results may assist clinicians in selecting appropriate scores for the corresponding clinical settings.
