Identifying ChatGPT-written Patient Education Materials Using Text Analysis and Readability

  SFX Search  Buy Article Permissions and Reprints Abstract

Objective Artificial intelligence (AI)-based text generators such as Chat Generative Pre-Trained Transformer (ChatGPT) have come into the forefront of modern medicine. Given the similarity between AI-generated and human-composed text, tools need to be developed to quickly differentiate the two. Previous work has shown that simple grammatical analysis can reliably differentiate AI-generated text from human-written text.

Study Design In this study, ChatGPT was used to generate 25 articles related to obstetric topics similar to those made by the American College of Obstetrics and Gynecology (ACOG). All articles were geared towards patient education. These AI-generated articles were then analyzed for their readability and grammar using validated scoring systems and compared to real articles from ACOG.

Results Characteristics of the 25 AI-generated articles included fewer overall characters than original articles (mean 3,066 vs. 7,426; p < 0.0001), a greater average word length (mean 5.3 vs. 4.8; p < 0.0001), and a lower Flesch–Kincaid score (mean 46 vs. 59; p < 0.0001). With this knowledge, a new scoring system was develop to score articles based on their Flesch–Kincaid readability score, number of total characters, and average word length. This novel scoring system was tested on 17 new AI-generated articles related to obstetrics and 7 articles from ACOG, and was able to differentiate between AI-generated articles and human-written articles with a sensitivity of 94.1% and specificity of 100% (Area Under the Curve [AUC] 0.99).

Conclusion As ChatGPT is more widely integrated into medicine, it will be important for health care stakeholders to have tools to separate originally written documents from those generated by AI. While more robust analyses may be required to determine the authenticity of articles written by complex AI technology in the future, simple grammatical analysis can accurately characterize current AI-generated texts with a high degree of sensitivity and specificity.

Key Points

More tools are needed to identify AI-generated text in obstetrics, for both doctors and patients.

Grammatical analysis is quick and easily done.

Grammatical analysis is a feasible and accurate way to identify AI-generated text.

Keywords obstetrics - artificial intelligence - patient education - ChatGPT Publication History

Received: 19 February 2024

Accepted: 24 March 2024

Accepted Manuscript online:
09 April 2024

Article published online:
02 May 2024

© 2024. Thieme. All rights reserved.

Thieme Medical Publishers, Inc.
333 Seventh Avenue, 18th Floor, New York, NY 10001, USA

留言 (0)

沒有登入
gif