Accuracy and consistency of chatbots versus clinicians for answering pediatric dentistry questions: A pilot study

Artificial Intelligence (AI), essentially a diverse collection of technologies that simulate human intelligence, has ushered in significant changes in various sectors, including healthcare [1]. Machine learning, a subset of AI, has been used to identify underlying statistical relationships within data, enabling it to make predictions about new, unobserved data. Deep learning, a subtype of machine learning, employs a series of layered mathematical computations to analyze and draw inferences from intricate data types, such as images [1]. AI has been employed in different medical fields, including diagnostics, prediction-making, and treatment planning [2]. In dentistry, deep learning has applications in oral lesion detection, landmark identification, charting, and implant classification [3,4]. In pediatric dentistry, AI, deep learning, and machine learning are used for early childhood caries detection, age estimation, tooth charting, treatment planning [5–8], and predicting future disease occurrence or progression [9,10].

A common application of AI is chatbots, which use text or voice to simulate human-like conversation. Chatbots can understand, interpret, and respond to human questions using natural language processing (NLP), a branch of AI. Modern NLP approaches build on Large Language Models (LLMs), which are trained on hundreds of terabytes of textual data and model the statistical distribution of words, graphemes, characters, and punctuation in the vast corpus of publicly available human-generated text [11]. In 2017, Google introduced its "Transformer" machine learning architecture specialized for NLP, which is now widely used for language processing. Currently, various LLM-based chatbots are available, such as Bidirectional Encoder Representations from Transformers (BERT, Google, USA) [12], Generative Pretrained Transformer-3 (GPT-3, OpenAI, USA) [13], Pathways Language Model (PaLM, Google, USA) [14], Language Model Meta AI (LLaMA, Meta and Microsoft, USA) [15], and the more recent GPT-4 (OpenAI, USA) [16,17]. These chatbots understand and generate conversational language based on the large dataset of conversational text used for training [18], predicting subsequent elements from preceding natural-language text. To generate human-like responses, they must understand the context and intent of the conversation and then predict the most likely next response or sequence of responses when prompted [21]. Chatbots also have inherent limitations, including the risk of producing plausible-sounding but incorrect answers [19,20].
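The next-token prediction principle described above can be sketched with a deliberately minimal bigram model, a toy stand-in for an LLM: it counts which word follows which in a corpus and predicts the most frequent continuation. The corpus and function names below are purely illustrative and do not come from any cited system.

```python
from collections import Counter, defaultdict

# Invented toy corpus; a real LLM is trained on terabytes of text.
corpus = "the tooth is sound the tooth is carious the gum is sound".split()

# Count how often each word follows each preceding word (bigram statistics).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the continuation most frequently observed after `word`."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "tooth" (seen twice, vs. "gum" once)
print(predict_next("is"))   # "sound" (seen twice, vs. "carious" once)
```

Real LLMs replace these raw bigram counts with Transformer networks that condition on much longer contexts, but the underlying task is the same: predict the most likely next element of the text.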

In medicine, an obvious use case of chatbots is answering targeted questions for knowledge management. Chatbots have provided moderately accurate answers to domain-specific questions in various medical fields, such as otolaryngology, ophthalmology, and urology [22,23]. In dentistry, the application of chatbots has seen limited exploration. Dental chatbots could be used to schedule appointments, educate patients, and identify dental problems based on symptom assessment, or they could support dental education [24,25]. So far, only a limited number of studies have evaluated the performance of chatbots in answering dental questions, for example, in endodontics [19,20,26–29]. Their application in pediatric dentistry is underexplored.

This study aimed to assess the accuracy of chatbots versus clinicians in answering pediatric dentistry questions. Specifically, we evaluated and compared the accuracy and consistency of responses from AI chatbots versus dental professionals (general dentists, dental students, and pediatric dentists) when answering such questions. Our null hypothesis was that there would be no significant difference in accuracy between chatbots and clinicians. This study provides much-needed insight into the capabilities of chatbots in pediatric dentistry relative to dental experts.
