The performance of ChatGPT in day surgery and pre-anesthesia risk assessment: a case-control study of 150 simulated patient presentations

It is widely recognized that China has a vast population, yet the nation continues to grapple with significant medical challenges, particularly a shortage of anesthesiologists. The number of day surgery procedures is increasing, making preoperative evaluation an especially important part of anesthesia care (Ojo et al. 2010). To date, exploration of ChatGPT in the medical field has focused mainly on medical education and scientific writing, with relatively little use in clinical and research scenarios (Kung et al. 2023; Shay et al. 2023). One of the key benefits of ChatGPT is its ability to provide instant, accurate, and personalized responses to a wide range of questions related to health care (Cascella et al. 2023; Liu et al. 2023; Odom-Forren 2023). Gupta et al. (2024) searched the literature to determine how ChatGPT could be helpful to anesthesia providers, including in preoperative management, ICU management, pain management, and palliative care. Their results showed that ChatGPT can be extremely useful for anesthesiologists, especially for determining anesthetic doses, assisting in retrieving research materials, or providing guidance on how to perform certain procedures.

The ASA physical status (ASA-PS) classification is an important and widely used index for preoperative evaluation of both anesthesia and surgical risk and is recognized throughout the world (Riley et al. 2014; Mayhew et al. 2019). Lim et al. (2023) used ChatGPT to evaluate 10 standardized hypothetical patient scenarios and reported that it classified ASA-PS consistently and correctly across multiple simulated patient scripts, with appropriate justification, performing similarly to human anesthesiologists in the majority of cases. Our study expanded the size of the patient cohort, broadened the spectrum of diseases under investigation, and showed that ChatGPT was highly useful for assessing the physical status of patients according to the ASA classification system, with its evaluations aligning largely with those of an expert panel.
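To make the notion of agreement concrete, the snippet below is a minimal sketch (in Python, using scikit-learn) of how concordance between ChatGPT's ASA-PS grades and an expert panel's grades might be quantified with a weighted Cohen's kappa; the grades shown are invented placeholders, and this is not the statistical analysis reported in our study.

```python
# Minimal sketch: quantifying agreement between ChatGPT and an expert panel
# on ASA-PS grades. The grades below are invented placeholders, not study data.
from sklearn.metrics import cohen_kappa_score

# Hypothetical ASA-PS grades (I = 1, II = 2, III = 3, ...) for a handful of cases
chatgpt_asa = [1, 2, 2, 3, 1, 2, 3, 2]
panel_asa   = [1, 2, 2, 3, 2, 2, 3, 2]

# A linearly weighted kappa penalizes larger disagreements (e.g., ASA I vs III) more
kappa = cohen_kappa_score(chatgpt_asa, panel_asa, weights="linear")
print(f"Linearly weighted Cohen's kappa: {kappa:.2f}")
```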

This study used ChatGPT to analyze each patient’s medical history, examination results, surgical procedure, and anesthesia technique. Leveraging these data, surgeons and anesthesiologists can obtain suitable risk assessment indicators, saving a significant amount of time and energy. In the majority of cases there were no significant differences between the responses of ChatGPT and those of the experts, underscoring ChatGPT’s proficiency in evaluating the physical condition of the simulated patients and assigning the correct ASA grades. This ability makes ChatGPT an exceptionally efficient tool for preoperative evaluation. To the best of our knowledge, this is the first study to assess the ability of ChatGPT to perform ASA grading and preoperative evaluation of patients, a function that would have major clinical value.
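For illustration, the sketch below shows one way a simulated patient presentation could be submitted to a chat model for ASA grading and a day-surgery recommendation. It assumes the OpenAI Python client; the model name, case summary, and prompt wording are illustrative assumptions rather than the exact protocol used in this study.

```python
# Illustrative sketch only: submitting a simulated patient presentation to a
# chat model for ASA-PS grading and a day-surgery recommendation. The prompt
# wording and model name are assumptions, not the study's actual protocol.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

case_summary = (
    "62-year-old male scheduled for day-case inguinal hernia repair under "
    "general anesthesia. History: well-controlled hypertension on amlodipine. "
    "Examination and laboratory results unremarkable. BMI 27."
)

response = client.chat.completions.create(
    model="gpt-4",  # illustrative model choice
    messages=[
        {
            "role": "system",
            "content": (
                "You are a pre-anesthesia assessment assistant. Assign an ASA "
                "physical status grade, state whether the patient is suitable "
                "for day surgery, and justify your answer briefly."
            ),
        },
        {"role": "user", "content": case_summary},
    ],
)

print(response.choices[0].message.content)
```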

To guarantee that all the anesthesiologists used the same criteria as ChatGPT when considering suitability for day-case surgery, the experts were required to base their judgments on the standardized 2019 guidelines for day-case surgery (Bailey et al. 2019). In our study, ChatGPT and the clinical experts sometimes held different views as to whether patients were eligible for day surgery. ChatGPT mostly recommended patients for day surgery directly after assessing their physical condition, surgical method, and anesthesia risk. For patients with ASA ≥ 2, the panel preferred to recommend further examination and treatment before considering suitability for day surgery, and for more seriously ill patients the panel recommended canceling day surgery at a higher rate.

Do these results indicate that ChatGPT is more liberal when considering the risks of anesthesia and surgery, while the panel is more conservative? We consider that the reasons for this difference may be as follows:

1. The difference may relate to the working habits of each expert group, with some groups applying stricter indications for day surgery than others. Ansell and colleagues carried out a retrospective case-controlled review of 896 ASA III patients who had undergone day-case procedures and concluded that, with good pre-assessment and adequate preparation, these patients could be treated safely in the day surgery setting (Ansell et al. 2004). Alternatively, Rasmussen considered that fitness for a procedure should relate to the patient’s functional status as determined by a pre-anesthetic assessment, and not by ASA physical status, age, or body mass index (Rasmussen et al. 2015).

2. ChatGPT analyzes a patient’s objective indicators and draws its conclusion only after synthesizing all of these indicators, which defines the reference value it works from.

3. It is difficult for medical staff to judge which decision was right or wrong, because the actual decision rests on the surgeon’s or anesthesiologist’s understanding of the guidelines and of the patient’s condition.

Although our study showed the benefits of ChatGPT as a tool, several problems remain to be considered. Firstly, the correctness and validity of the generated content must be considered, as incorrect content may mislead the patient. While ChatGPT can provide a wealth of information and helpful assistance, at present it cannot completely replace human healthcare workers in all situations. Unlike search engines, ChatGPT does not reveal the sources of its information. It therefore sometimes answers questions incorrectly, and at the time of this study its training data had not been updated since 2021. In addition, ChatGPT cannot access the Internet in real time. To overcome these shortcomings, manual auditing can be used to screen the generated content and judge its accuracy (Lee et al. 2023).

Secondly, attention should be paid to ethical and privacy issues. When communicating with ChatGPT, patients provide their basic personal information and medical conditions, sometimes including sensitive images of private parts of the body. Although ChatGPT claims that it does not save conversations with users, it must be understood that sensitive health information may be compromised or misused during transmission and browsing. It is therefore necessary to implement sound data protection measures, including encryption of sensitive information and secure data transmission. In addition, because ChatGPT can provide considerable emotional support, care is needed to ensure that anxious patients do not become psychologically dependent on this “friend”. Strict ethical and privacy regulations therefore need to be established to limit the scale of information input to, and emotional output from, ChatGPT.
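As one concrete illustration of such a protection measure, the sketch below encrypts a sensitive patient record before storage or transmission using symmetric (Fernet) encryption from the Python `cryptography` package; the record text is invented, and key management and transport security (e.g., TLS) are outside the scope of this example.

```python
# Minimal sketch: encrypting sensitive patient text before storage or
# transmission, using symmetric (Fernet) encryption from the `cryptography`
# package. The record below is invented; key management is not shown.
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in practice, held in a secure key store
cipher = Fernet(key)

record = "Patient X, ASA II, hypertension, scheduled for day-case cholecystectomy"
token = cipher.encrypt(record.encode("utf-8"))    # ciphertext safe to transmit
restored = cipher.decrypt(token).decode("utf-8")  # only key holders can read it

assert restored == record
```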

ChatGPT has the potential to revolutionize the way patients are evaluated by providing accurate and effective clinical help. As with any new technology, there are shortcomings that need to be addressed, but the potential benefits of ChatGPT in the field of ambulatory surgical evaluation are enormous. Development is the order of the day, and healthcare workers therefore need to keep up with this trend and explore this promising area of technology. In this regard, it has been proposed that large language models (LLMs) such as ChatGPT have the potential to add a new dimension to solving clinical problems. It is also important to realize that ChatGPT is just a machine and cannot replace the humanity and compassion that are so essential to our profession (Odom-Forren 2023).

ChatGPT can therefore be regarded as impartial and potentially offers a sense of reassurance. In a profit-oriented healthcare system, such as that of the USA, it is evident that financial incentives can influence the guidance provided to patients. Therefore, having an independent assessor to oversee medical assessments and decisions would be highly beneficial. As we continue to explore the possibilities of AI in health care, it is important to embrace these new technologies and use them to augment, rather than replace, important clinical work.

While our research offers valuable insights, it is important to acknowledge its limitations. We used simulated patient data, which, although closely mirroring real patients, inevitably differ in certain respects; this discrepancy may introduce some degree of error that future studies using real patient data could address. In addition, the limited number of simulated cases restricts the generalizability of our findings, and expanding the case pool would enhance the robustness of our conclusions. Furthermore, our study focused solely on day surgery patients. Further research is needed to assess the applicability of ChatGPT to pre-anesthetic evaluation for major surgery or emergency procedures.
