Discordance in Appendicitis Grading and the Association with Outcomes: A Post-Hoc Analysis of an EAST Multicenter Study

ABSTRACT

Background

The American Association for the Surgery of Trauma (AAST) appendicitis severity grading criteria use independent subscales for radiologists (Rad), surgeons (Surg), and pathologists (Path). We reviewed the EAST Multicenter Study of the Treatment of Appendicitis in America: Acute, Perforated, and Gangrenous (MUSTANG) database to determine rates of discordance and clinical consequences of inaccuracy.

Materials and Methods

A confusion matrix was constructed for pairs among Rad, Surg, and Path. Accuracy was reported using the chronologically latest diagnosis as the gold standard. “Concordance” (C) was achieved when both agreed on the severity grade and “Discordance” (D) when they disagreed. A composite endpoint (“COMP” = 30-d incidence of surgical site infection, abscess, wound complication, Clavien-Dindo complication, secondary intervention, Emergency Department [ED] visit, hospital readmission, and mortality) was compared between C versus D groups via χ2 test with Bonferroni correction to define statistical significance (P = 0.05/9 = 0.005).

Results

For each pair and diagnosis, subjects were categorized as C or D and compared for the incidence of COMP. Incidence of COMP for Surg and/or Path in C versus D: 16% versus 26% (p = 0.006, NS by Bonferroni) for acute (A), 39% versus 33% (p = 0.39) for gangrenous (G), and 48% versus 37% (p = 0.035, NS by Bonferroni) for perforated (P). For Rad and/or Path in C versus D: 17% versus 42% (p < 0.001) for A, 27% versus 31% (p = 0.95) for G, and 56% versus 48% (p = 0.48) for P. For Rad and/or Surg in C versus D: 17% versus 40% (p < 0.001) for A, 36% versus 26% (p = 0.43) for G, and 51% versus 39% (p = 0.29) for P.

Conclusions

In appendicitis treated by appendectomy, surgeons are most accurate at diagnosing acute appendicitis and least accurate at diagnosing gangrenous appendicitis. Radiologists are less accurate for all categories. When the surgeon is wrong, clinical outcomes are not significantly worse. However, when the radiologist is wrong about acute appendicitis, patients have worse clinical outcomes.

Introduction

Acute appendicitis is a leading cause of acute abdominal operations, with approximately 300,000 appendectomies performed annually in the United States (Addis D.G, Shaffer N, Fowler B.S et al. The epidemiology of appendicitis and appendectomy in the United States). Nearly all patients with symptoms suggestive of appendicitis undergo imaging workup to confirm the diagnosis before proceeding to treatment options such as antibiotics alone, percutaneous drainage, or appendectomy.

Yeh D.D, Eid A, Young K et al. Multicenter study of the treatment of appendicitis in America: acute, perforated, and gangrenous (MUSTANG), an EAST Multicenter Study. Ann Surg. 2019;

Of the most common imaging options, computed tomography (CT) has become the preferred modality, as it is ubiquitous, user-independent, and highly sensitive. A recent study reported that the positive predictive value (PPV) for pre-appendectomy CT was as high as 92% (Collins G.B, Tan T.J, Gifford J et al. The Accuracy of pre-appendectomy computed tomography with histopathological correlation: a clinical audit, case discussion and evaluation of the literature). Once the plan of care is established based on CT findings reported by a radiologist and the surgeon's clinical impression, the intraoperative diagnosis by gross inspection of the appendix is documented, and the specimen is sent to the pathology lab for confirmation of the diagnosis by a pathologist. The American Association for the Surgery of Trauma (AAST) severity grading criteria (Table 2) (Tominaga G.T, Staudenmayer K.L, Shafi S et al. The American association for the surgery of trauma grading scale for 16 emergency general surgery conditions: disease-specific criteria characterizing Anatomic Severity Grading) use subscales for assessment by radiologists, surgeons, and pathologists and have been independently validated as a predictor for clinically important, patient-centered outcomes (Vasileiou G, Ray-Zack M, Zielinski M et al. Validation of the American association for the surgery of trauma emergency general surgery score for acute appendicitis - an EAST Multicenter Study). However, the severity grade may differ within the same case. For example, a radiologist may grade a case as grade 1, the surgeon may grade it as grade 2, and the pathologist may grade it as grade 3. Disagreements on severity assessment may be due to disease progression (natural history) or may be due to inaccuracy on the part of the radiologist or surgeon. These discrepancies could have serious implications for patient care or, alternatively, may not affect patient outcome at all.
We sought to (1) quantify the incidence of concordance and/or discordance between radiologists, surgeons, and pathologists; and (2) investigate if concordance and/or discordance is associated with differences in clinical outcomes.

Table 2. AAST Emergency General Surgery Grading System for Acute Appendicitis (J Trauma Acute Care Surg 2016; 81(3):593-602)

AAST = American Association for the Surgery of Trauma; CT = Computed Tomography; RLQ = Right Lower Quadrant

Materials and Methods

This is a post-hoc analysis of the Multicenter Study of the Treatment of Appendicitis in America: Acute, Perforated, and Gangrenous (MUSTANG) study, which collected data prospectively from 28 medical centers in the United States between January 2017 and June 2018.

Yeh D.D, Eid A, Young K et al. Multicenter study of the treatment of appendicitis in America: acute, perforated, and gangrenous (MUSTANG), an EAST Multicenter Study. Ann Surg. 2019;

(Table 1) Local Institutional Review Boards of all 28 enrolling sites approved the MUSTANG study, and the majority waived the requirement for informed consent. This study is a post-hoc analysis of the MUSTANG data.

Table 1. Demographics and Clinical Characteristics

BMI = Body Mass Index; CCI = Charlson Comorbidity Index; AAST = American Association for the Surgery of Trauma

A confusion matrix was used as a performance measure for classification, whereby the “correctness” of classification was evaluated by computing the number of correctly diagnosed appendicitis grades using the gold standard diagnosis as a reference (A systematic analysis of performance measures for classification tasks). Three matrices were constructed for pairs among radiologists (Rad), surgeons (Surg), and pathologists (Path). Accuracy in diagnosis is reported using the chronologically latest diagnosis as the gold standard (i.e., Surg for the Rad and/or Surg pair and Path for the Surg and/or Path pair). “Concordance” was defined as when the pair agreed upon the diagnosis, and “Discordance” was assigned when the pair disagreed upon the diagnosis. Based on the AAST severity grading criteria, findings of grade 1 appendicitis are diagnosed as acute, grade 2 as gangrenous, and grades 3-5 as perforated. Complicated appendicitis is defined as gangrenous or perforated.
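The pairing described above can be sketched in a few lines of Python. This is only an illustration, not the authors' analysis code (the study reports using R): the function names, the toy grade lists, and the choice to compute each category's concordance rate over the cases that the earlier assessor placed in that category are all assumptions made for the example.

```python
from collections import Counter

CATEGORIES = ("acute", "gangrenous", "perforated")


def grade_to_category(grade: int) -> str:
    """Map an AAST grade to a category: 1 = acute, 2 = gangrenous, 3-5 = perforated."""
    if grade == 1:
        return "acute"
    if grade == 2:
        return "gangrenous"
    if 3 <= grade <= 5:
        return "perforated"
    raise ValueError(f"unexpected AAST grade: {grade}")


def confusion_matrix(earlier, reference):
    """Count (earlier assessor, gold standard) category pairs as a nested dict."""
    counts = Counter(zip(earlier, reference))
    return {e: {r: counts[(e, r)] for r in CATEGORIES} for e in CATEGORIES}


def concordance_rate(matrix, category):
    """Share of the earlier assessor's calls for `category` that the gold standard confirmed.

    Assumption: the denominator is the earlier assessor's row total.
    Returns None when the earlier assessor never used the category.
    """
    row_total = sum(matrix[category].values())
    if row_total == 0:
        return None
    return matrix[category][category] / row_total


# Hypothetical toy data: radiologist grades versus surgeon grades (the later diagnosis).
rad_grades = [1, 1, 1, 2, 3, 4, 1, 2]
surg_grades = [1, 1, 2, 2, 3, 1, 1, 5]

rad = [grade_to_category(g) for g in rad_grades]
surg = [grade_to_category(g) for g in surg_grades]

matrix = confusion_matrix(rad, surg)
rates = {c: concordance_rate(matrix, c) for c in CATEGORIES}
```

The diagonal of `matrix` holds the concordant cases and every off-diagonal cell holds discordant ones, so the same structure serves all three assessor pairs by swapping the input lists.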

Due to the very low incidence of each individual outcome, a composite outcome was created and included 30-d incidence of: surgical site infection, intra-abdominal abscess, wound complication, any Clavien-Dindo complication, any secondary intervention, Emergency Department (ED) visit, hospital readmission, and mortality. “Concordance” and “Discordance” percentages for each pair among the groups were calculated and rates of the composite endpoint were compared using the Bonferroni correction to guard against the increased risk of type 1 statistical errors with multiple comparisons. There were 9 comparison groups, so the adjusted alpha value was calculated to be p = 0.05/9 = 0.005 (Bonferroni correction). Thus, a p-value < .005 was considered statistically significant and all p-values were compared to this threshold. Continuous data are reported as mean ± standard deviation or median (interquartile range) for nonparametric distributions. Comparisons between groups were performed using Pearson's chi-squared test, and 95% confidence intervals (CI) and p-values are reported. The data analysis was performed using R Studio Version 1.1.456.
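As a rough illustration of the comparison described above, the following Python sketch computes a Bonferroni-adjusted threshold and a Pearson chi-squared test on a single 2x2 table. It is not the authors' code: the function names and the counts are hypothetical, and the test is computed without a continuity correction, which is an assumption because the text does not say whether one was applied. Note that 0.05/9 is approximately 0.0056, so the stated threshold of 0.005 is slightly more conservative than the exact quotient.

```python
import math


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance threshold under the Bonferroni correction."""
    return alpha / n_comparisons


def chi2_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper tail
    probability equals erfc(sqrt(statistic / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Exact Bonferroni quotient for 9 comparisons at alpha = 0.05.
threshold = bonferroni_threshold(0.05, 9)

# Hypothetical counts, NOT study data.
# Rows: concordant, discordant. Columns: composite outcome, no composite outcome.
stat, p_value = chi2_2x2(34, 166, 20, 30)

# Compare against the threshold stated in the text (0.005).
significant = p_value < 0.005
```

Each of the nine comparisons (three assessor pairs by three diagnoses) would be one such table, with its p-value judged against the same adjusted threshold.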

Results

Radiologist and Surgeon

A total of 2,795 cases were analyzed for Concordance and Discordance among radiologists and surgeons. Concordance was present in 2,045 cases (81%) of acute appendicitis, 31 cases (20%) of gangrenous appendicitis, and 81 cases (68%) of perforated appendicitis (Table 3). Among radiologists and surgeons, the composite outcome occurred in 17% of cases of Concordance and 40% of cases of Discordance in acute appendicitis (p < 0.001), in 36% of cases of Concordance and 26% of cases of Discordance in gangrenous appendicitis (p = 0.43), and in 51% of cases of Concordance and 39% of cases of Discordance in perforated appendicitis (p = 0.29).

Table 3. Comparison of Findings of Acute, Gangrenous, and Perforated Appendicitis Among Surgeons and Radiologists

Rad = Radiologist; Surg = Surgeon

Surgeon and Pathologist

A total of 2,857 cases were analyzed for Concordance and Discordance between surgeons and pathologists. Concordance was present in 2,040 cases (94%) of acute appendicitis, 106 cases (42%) of gangrenous appendicitis, and 311 cases (70%) of perforated appendicitis (Table 4). Among surgeons and pathologists, the composite outcome occurred in 16% of cases of Concordance and 26% of cases of Discordance in acute appendicitis (p = 0.006), in 39% of cases of Concordance and 33% of cases of Discordance in gangrenous appendicitis (p = 0.39), and in 48% of cases of Concordance and 37% of cases of Discordance in perforated appendicitis (p = 0.035).

Table 4. Comparison of Findings of Acute, Gangrenous, and Perforated Appendicitis Among Surgeons and Pathologists

Surg = Surgeon; Path = Pathologist

Radiologist and Pathologist

A total of 2,768 cases were analyzed for Concordance and Discordance among radiologists and pathologists. Concordance was present in 2,041 cases (82%) of acute appendicitis, 22 cases (14%) of gangrenous appendicitis, and 84 cases (65%) of perforated appendicitis (Table 5). Among radiologists and pathologists, the composite outcome occurred in 17% of cases of Concordance and 42% of cases of Discordance in acute appendicitis (p < 0.001), in 27% of cases of Concordance and 31% of cases of Discordance in gangrenous appendicitis (p = 0.95), and in 56% of cases of Concordance and 48% of cases of Discordance in perforated appendicitis (p = 0.48).

Table 5. Comparison of Findings of Acute, Gangrenous, and Perforated Appendicitis Among Radiologists and Pathologists

Rad = Radiologist; Path = Pathologist

Discussion

This study's aim was to investigate the level of agreement between radiologists, surgeons, and pathologists in diagnosing simple and complicated appendicitis and to analyze the association of concordance and/or discordance with clinically important patient outcomes.

Accuracy of the Diagnosis

Based on concordance results among surgeons and pathologists (the latter considered the gold standard; Jones A.E, Phillips A.W, Jarvis J.R et al. The value of routine histopathological examination of appendicectomy specimens), surgeons were most accurate at diagnosing acute appendicitis and least accurate at diagnosing gangrenous appendicitis. Signs of inflammation such as erythema and edema are easily discernible by a surgeon in the operating room, and this may explain the high accuracy in diagnosing acute appendicitis. However, perforation and gangrene may be more challenging to observe in the operating room. In the event of perforation, free fluid, pus, or stool may not always be present to hint at a possible perforation, and the perforation may sometimes be too small to visualize by the naked eye. Similarly, gangrene may be at the early stages of development and can only be visualized at the cellular level under a microscope. Histologically, gangrene starts deep to the serosa; it may be intraluminal, and the serosa may still appear normal.

Investigations on discordance between surgeons and pathologists have been previously reported in the literature (Correa J, Jimeno J, Vallverdu H et al. Correlation between intraoperative surgical diagnosis of complicated acute appendicitis and the pathology report: clinical implications; Bliss D, Mckee J, Cho D et al. Discordance of the pediatric surgeon's intraoperative assessment of pediatric appendicitis with the pathologist's report), but never before using the standardized and validated AAST grading scale. Our results show higher concordance rates at diagnosing different categories of appendicitis in the adult population than reported in the literature in the pediatric population. One study that included 1,166 pediatric patients with acute appendicitis in a single center reported concordance among surgeons and pathologists in only 48% of the cases (Fallon S.C, Kim M.E, Hallmark C.A et al. Correlating surgical and pathological diagnoses in pediatric appendicitis). Our results also show higher concordance rates when compared to results of other studies in the adult population. One study that included 342 adult patients with acute appendicitis in a single center reported concordance rates of 80% with differences in rates between female and male patients (Pourhabibi Zarandi N, Javidi Parsijani P, Bolandparvaz S et al. Accuracy of surgeon's intraoperation diagnosis of acute appendicitis, compared with the histopathology results).

Based on concordance results among radiologists and pathologists, radiologists were also most accurate at diagnosing acute appendicitis and least accurate at diagnosing gangrenous appendicitis. Similar to the challenges surgeons face in the operating room, it may be easy for a radiologist to observe thickening of the appendix and enhancement with contrast, but the absence of intraperitoneal fluid and other clear indications of perforation poses a challenge for this diagnosis. Our concordance rates are higher than those reported by Gaskill et al, who reported a 38% rate of concordance between radiologists and pathologists and only 28% concordance between surgeons and pathologists when diagnosing perforated appendicitis at a single center (Gaskill C.E, Simianu V.V, Carnell J et al. Use of computed tomography to determine perforation in patients with acute appendicitis). There are at least two possible explanations for variation in accuracy of radiologist diagnosis. First, it is possible that a case of simple acute appendicitis progressed to perforated or gangrenous appendicitis in the interval between CT scan and operation due to the natural history of the disease process. Unfortunately, our study design and available data do not allow us to make inferences about this potential mechanism. Second, it is possible that the appendix was perforated or gangrenous at the time of CT scan but was missed or undetected. Factors contributing to a lower accuracy rate may include variations in CT scan resolutions (e.g., 16-slice versus 128-slice), institution-specific scanning protocols (e.g., abbreviated scan to minimize radiation, rectal contrast, etc.), and experience level of radiologists.

Association with Clinical Outcomes

Disagreement between a surgeon and a pathologist on the diagnosis of acute appendicitis implies that the surgeon diagnosed simple acute appendicitis when in fact the correct diagnosis was perforated or gangrenous. Because the pathologist's final report is often not available to the surgeon until after the patient has been discharged from the hospital, the patient is clinically treated as acute appendicitis. Therefore, it is reasonable to expect a higher rate of complications in the event of discordance with acute appendicitis, as the patient will have been undertreated for complicated appendicitis. The association between complicated appendicitis, mainly perforated, and increased rates of morbidity and mortality was established in the literature many decades ago, with outcomes such as wound infection and prolonged hospital length of stay more likely to occur in perforated appendicitis. However, these studies were conducted prior to the era of laparoscopic surgery (Law D, Law R, Eiseman B et al. The continuing challenge of acute and perforated appendicitis; Hale D.A, Molloy M, Pearl R.H et al. Appendectomy: a contemporary appraisal). After controlling for multiple comparisons with the Bonferroni correction, we did not observe an increase in complications when the surgeon diagnosed acute appendicitis and the pathologist diagnosed complicated appendicitis. This implies that, in the current era of laparoscopic appendectomy, perforation or gangrene too subtle to be detected by a surgeon's naked eye is closer in clinical post-operative behavior to simple appendicitis than obvious perforated or gangrenous appendicitis.

The composite endpoint was more likely to occur when radiologists and surgeons disagreed on the diagnosis of acute appendicitis, meaning that the radiologist diagnosed simple appendicitis and the surgeon diagnosed complicated appendicitis (gangrenous or perforated). Disagreement between a radiologist and a pathologist on the diagnosis of acute appendicitis occurred when the radiologist diagnosed simple acute appendicitis when in fact the correct diagnosis was perforated or gangrenous appendicitis. This type of discordance was also associated with significantly higher rates of complications. Similarly, the composite outcome was more likely to occur when radiologists disagreed with pathologists on the diagnosis of gangrenous appendicitis. Our findings should be interpreted with caution, as we are unable to discern (as stated above) if the discordance is due to interval disease progression, missed diagnosis, or undetected diagnosis due to scanning technique and interpreter experience. It is possible that the current emphasis on minimizing radiation exposure to the patient may result in decreased diagnostic accuracy and worse clinical outcomes for the patient. It is important to note, though, that discordance of acute and gangrenous appendicitis by the radiologist was associated with real clinical consequences for the patients, and this topic deserves further study for quality improvement.

Limitations

There are several limitations to this study. First, this study used the data from a multicenter study with an observational study design. Therefore, the clinical endpoints included in our composite outcome were dependent on clinical encounters for follow-up only, and patients were not contacted by study teams for research follow-up. This may have led to selection bias when comparing the 30-d outcomes included in our composite outcome if patients experiencing the outcome were more likely to follow up than those not experiencing the outcome. Additionally, the appendicitis Clinical AAST Grade for 386 subjects in the database was missing, so these subjects were excluded from our analyses. Although this group constituted only 11% of the total sample, bias due to these missing data is possible. Second, due to the observational design of the original study, we are only able to describe associations without drawing any conclusions on causality. Our findings should be considered hypothesis-generating. Third, because this was a post-hoc analysis, this study is missing some data that would have been helpful in explaining the reasons for discordance. For example, we do not know at what time antibiotics were given in the ED relative to the CT scan diagnosis of appendicitis and the time between the scan and surgical intervention; also lacking are the institution-specific CT scan protocols as well as the degree of scrutiny applied by the surgeon during the operation.

Despite these limitations, the strengths of our study include the use of a data set with a large sample size and a diverse population. The data set included patients from 28 different sites around the United States, both urban and rural, academic and community, large and small hospitals. The broad geographic representation improves the generalizability of our findings. This study contributes to the existing literature by documenting the incidence of concordance/discordance in diagnosis using a standardized and validated grading schema and describing the associations of concordance and/or discordance with clinically important, patient-centered outcomes up to 30 days after operation. Ideally, all three groups of assessors would have perfect concordance and zero discordance, but it is unlikely that this can be achieved. Therefore, the next step would be to compare concordance and/or discordance rates between the various enrollment sites to identify outliers with low discordance rates and attempt to emulate their processes in a quality improvement project. This study serves as a starting point for that process, and one implication is that knowledge of the accuracy of diagnosis (particularly of radiologists) may influence treatment decisions. For example, some surgeons may be reluctant to operate on perforated appendicitis, preferring instead initial non-operative management followed by possible interval appendectomy. However, our data show that in over one-third of cases diagnosed as “perforated” by the radiologist, the surgeon diagnosed acute or gangrenous appendicitis. Additionally, post-operative antibiotics are not commonly prescribed for acute appendicitis but are commonly accepted in practice for perforated appendicitis. Nevertheless, in our study, one in four cases of “perforated” appendicitis as diagnosed by the surgeon was ultimately determined to be simple acute appendicitis by the pathologist, suggesting possible antibiotic overtreatment in the post-operative period before final pathology results are known.

Conclusions

In appendicitis treated by appendectomy, surgeons are most accurate at diagnosing acute appendicitis and least accurate at diagnosing gangrenous appendicitis. When surgeons make the wrong diagnosis (compared to final pathology), clinical outcomes up to 30 days after operation are not significantly worse. However, when radiologists are wrong in their diagnosis of acute appendicitis, patients have significantly worse clinical outcomes.

Acknowledgments

We are grateful to the following colleagues from the EAST Appendicitis Research Group for their contribution to data collection in the original EAST MUSTANG Database. Without their work, this article would not have been possible: Baystate Medical Center: Reginald Alouidor and Kailyn Kwong Hing - Beaumont Hospital: Victoria Sharp and Thomas Serena - Boston Medical Center: George Kasotakis and Sean Perez - Carilion Clinic: Stacie L. Allmond and Bruce Long - Cooper University Hospital: Nadine Barth and Janika San Roman - Denver Health: Ryan A. Lawless and Alexis L. Cralley - Emory University: Rondi Gelbard and Crystal Szczepanski - Essentia Health: Steven Eyer and Kaitlyn Proulx - Geisinger Medical Center: Jeffrey Wild and Katelyn A. Young - Inova Fairfax: Erik J. Teicher and Elena Lita - Intermountain Medical Center: David Morris and Laura Juarez - Loma Linda University: Richard D. Catalano and David Turay - Marshfield Clinic: Daniel C. Cullinane and Jennifer C. Roberts - Massachusetts General Hospital: Haytham M.A. Kaafarani and Ahmed I. Eid - Mayo Clinic: Mohamed Ray-Zack and Tala Kana’an - Medical City Plano: Victor Portillo and Morgan Collom - Medical College of Wisconsin: Chris Dodgion and Savo Bou Zein Eddine - North Shore Medical Center: Maryam B. Tabrizi and Ahmed Elsayed Mohammed Elsharkawy - Ryder Trauma Center: D. Dante Yeh and Georgia Vasileiou - Ohio State University Wexner Medical Center: David C. Evans and Daniel E. Vazquez - St. Vincent Hospital Indianapolis: Jonathan Saxe and Lewis Jacobson - Oregon Health Sciences University: Brandon Behrens and Martin Schreiber - University of Arizona, Tucson: Bellal Joseph and Muhammad Zeeshan - University of California, Irvine: Jeffry Nahmias and Beatrice Sun - University of Florida, Jacksonville: Marie Crandall and Jennifer Mull - University of Maryland: Jason D. Pasley and Lindsay O’Meara - University of Southern California: Ali Fuat Kann Gok and Jocelyn To - Walter Reed National Military Medical Center: Carlos Rodriguez and Matthew Bradley.

Author Contributions

K.A.J. and D.D.Y. designed the study. H.Z. performed the analysis. K.A.J. wrote the manuscript. E.U., A.C., S.B., R.R., G.D.P., and N.N. completed critical revision.

Disclosures

The authors report no proprietary or commercial interest in any product mentioned or concept discussed in this article.

References

Addis D.G, Shaffer N, Fowler B.S et al. The epidemiology of appendicitis and appendectomy in the United States. Am J Epidemiol. 132: 910-925

Yeh D.D, Eid A, Young K et al. Multicenter study of the treatment of appendicitis in America: acute, perforated, and gangrenous (MUSTANG), an EAST Multicenter Study. Ann Surg. 2019;

Collins G.B, Tan T.J, Gifford J et al. The Accuracy of pre-appendectomy computed tomography with histopathological correlation: a clinical audit, case discussion and evaluation of the literature. Emerg Radiol. 21: 589-595

Tominaga G.T, Staudenmayer K.L, Shafi S et al. The American association for the surgery of trauma grading scale for 16 emergency general surgery conditions: disease-specific criteria characterizing Anatomic Severity Grading. J Trauma Acute Care Surg. 81: 593-602

Vasileiou G, Ray-Zack M, Zielinski M et al. Validation of the American association for the surgery of trauma emergency general surgery score for acute appendicitis - an EAST Multicenter Study. J Trauma Acute Care Surg. 87: 134-139

A systematic analysis of performance measures for classification tasks. J Inform Process Manag. 45: 427-437

Jones A.E, Phillips A.W, Jarvis J.R et al. The value of routine histopathological examination of appendicectomy specimens. BMC Surg. 7: 17

Correa J, Jimeno J, Vallverdu H et al. Correlation between intraoperative surgical diagnosis of complicated acute appendicitis and the pathology report: clinical implications. Surg Infect. 16: 41-44

Bliss D, Mckee J, Cho D et al. Discordance of the pediatric surgeon's intraoperative assessment of pediatric appendicitis with the pathologist's report. J Pediatr Surg. 45: 1398-1403

Fallon S.C, Kim M.E, Hallmark C.A et al. Correlating surgical and pathological diagnoses in pediatric appendicitis. J Pediatr Surg. 50: 638-641

Pourhabibi Zarandi N, Javidi Parsijani P, Bolandparvaz S et al. Accuracy of surgeon's intraoperation diagnosis of acute appendicitis, compared with the histopathology results. Bull Emerg Trauma. 2: 15-21

Gaskill C.E, Simianu V.V, Carnell J et al. Use of computed tomography to determine perforation in patients with acute appendicitis. Curr Probl Diagn Radiol. 47: 6-9

Law D, Law R, Eiseman B et al. The continuing challenge of acute and perforated appendicitis. Am J Surg. 131: 533-535

Hale D.A, Molloy M, Pearl R.H et al. Appendectomy: a contemporary appraisal. Ann Surg. 225: 252-261

Article Info

Publication History

Published online: May 05, 2021

Accepted: February 27, 2021

Received in revised form: January 5, 2021

Received: August 15, 2020

Footnotes

Abstract of this manuscript was accepted for oral presentation at the 40th SIS Annual Meeting, April 17-20, 2020 in Denver, Colorado (event cancelled due to the Covid-19 pandemic).

Identification

DOI: https://doi.org/10.1016/j.jss.2021.02.048

Copyright

© 2021 Elsevier Inc. All rights reserved.
