Responses of Five Different Artificial Intelligence Chatbots to the Top Searched Queries About Erectile Dysfunction: A Comparative Analysis

The results of this study indicate that the AI chatbots’ responses to questions concerning ED did not meet recommended readability standards. While Copilot demonstrated satisfactory quality with minor flaws, Bard and Ernie Bot displayed notable quality issues. Furthermore, while ChatGPT’s responses were comparatively the most difficult to read, Bard’s were the easiest to understand. To our knowledge, this is the first study to assess, analyze, and compare ED information obtained from different AI chatbots.

Interest in ED has been growing worldwide over the years, which may be attributed to the increasing incidence of diseases that cause ED. The incidence of ED is expected to rise further, which will likely generate even greater interest in the condition. In this study, the three most frequently searched keywords were “erectile dysfunction cause,” “how to erectile dysfunction,” and “erectile dysfunction treatment.” Many people searched for the causes of ED, and finding the safest and most effective treatment options was the top priority for many men.

Africa showed the highest search interest for ED. Zimbabwe, Zambia, and Ghana ranked as the top three nations with the highest search interest for ED. This suggests that many people from these countries are actively seeking information, including potential solutions, for ED and that there is a need for awareness, education, and accessible treatments for the condition in these nations. Furthermore, Africa is projected to experience the largest percentage increase in ED, with a predicted rise of 169% between 1995 and 2025 [14]. In a study conducted in Zimbabwe, the prevalence of ED in patients with diabetes was 73.9% [15]. In Zambia, this rate was 56%–68% [16]. Therefore, healthcare professionals in these countries should be knowledgeable regarding the prevalence and risk factors of ED and its treatment options [17].

The quality of health information is pivotal to the efficacy, cost-effectiveness, and safety of healthcare provision, and it also enhances patient engagement and satisfaction [18]. The present study revealed that while ChatGPT, Bing Chat, and Copilot demonstrated acceptable quality with minor flaws, Ernie Bot and Bard exhibited substantial quality issues. Contrary to these findings, Cocci et al. [19] observed that ChatGPT produced low-quality information for urology patients. However, continuous improvements in AI chatbot systems are likely responsible for the enhanced quality observed in this study [20]. Nevertheless, caution should be exercised when relying on health-related information obtained from Ernie Bot, whereas Copilot has emerged as a valuable source of such information. This study also emphasizes the importance of improving the material produced by AI chatbots. To achieve this, several measures could be adopted, such as expanding chatbots’ access to medical literature and research to enhance their knowledge repositories. This extension could improve their ability to provide more dependable information on health-related subjects. In addition, including parameters tailored to healthcare data during AI model training could significantly improve their capacity to provide contextually relevant and medically accurate responses.

Online health information that is difficult to understand can lead to the dissemination of misinformation, potentially endangering individual health [21]. The present study revealed that the AI chatbots’ responses on ED exceeded the reading level recommended by the National Institutes of Health, which advises that patient materials be written at a sixth-grade level. Temel et al. [9] found that the texts produced by ChatGPT on spinal cord injury are challenging to read. In a similar vein, Momenaei et al. [22] observed that the content produced by ChatGPT-4 on the surgical treatment of retinal diseases required a high reading level. According to Önder et al. [23], the information produced by ChatGPT-4 on hypothyroidism during pregnancy would require a minimum of 9 years of education. Our study revealed that ChatGPT’s responses require a high level of education to understand; although Bard is comparatively easier to understand, it also requires a high education level. These results emphasize the need to ensure that AI chatbots provide precise and readily comprehensible information, particularly on andrological health subjects such as ED. With human intervention, AI chatbots’ output can be made more readable; using algorithms combined with human supervision, the produced material can be restructured to conform to specified readability standards.
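For context, one widely used index behind such grade-level estimates is the Flesch–Kincaid Grade Level; the formula below is shown only as an illustration of how sentence and word length drive these scores, and is not a restatement of the specific readability indices applied in this study:

\[
\text{FKGL} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59
\]

A text composed of long sentences and multisyllabic terms therefore yields a higher grade level, which is consistent with chatbot responses exceeding the sixth-grade target.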

The popularity of accessing online health information, particularly through technologies such as AI chatbots, is increasing. However, we maintain that in its present form it cannot substitute for a comprehensive medical assessment and consultation with a healthcare professional. Although internet sources can offer valuable insights, they lack the individualized and comprehensive evaluation necessary for accurate diagnosis and treatment [24]. Maintaining confidentiality between a doctor and a patient with sexual health issues such as ED is important, and forming this bond is crucial for tailored therapy that takes into account distinctive aspects that cannot be fully captured by digital interactions alone. In addition, it is essential to consider patients’ social background and their families when providing medical advice. Therefore, although AI chatbots can provide valuable insights regarding ED, as with other health subjects, they should be considered only an additional source of information and not a replacement for expert medical guidance and treatment.

This study has certain limitations. First, the search was restricted to the first 25 terms, which may have compromised the comprehensiveness of the results; integrating additional keywords could yield more precise conclusions. Furthermore, broadening the analysis to include non-English keywords might extend the scope of the assessment and produce more universally relevant conclusions. Second, this study evaluated the responses of only five AI chatbots. Given the dynamic nature of this field and the continual development of new models, future studies including a wider range of AI chatbots are warranted, as they may enhance the precision of the results.
