Use of AI Language Engine ChatGPT 4.0 to Write a Scientific Review Article Examining the Intersection of Alzheimer’s Disease and Bone

The main goal of this writing experiment was to evaluate the ability of AI to save time during the process of writing a review article. While all articles related to the Comment focused on the topic “The Intersection of Alzheimer’s Disease and Bone,” and therefore contained similar information, each was unique and differed greatly in the effort required to reach publishable quality [2,3,4]. The majority of the time devoted to the human-written paper went to the literature review and writing phase. This was to be expected, as gathering resources, reading articles, and synthesizing large amounts of information takes considerable time. This style of writing also requires the authors to spend substantial time ensuring proper grammar, word choice, and structure. These factors led to a total of 29.25 h spent in the writing phase of the human-written article. In comparison, the AIA article required 9.17 h to write and the AIO article a mere 1.7 h. The AIA article required the author to convert resources into a format readable by ChatGPT, input sources one by one, and work through errors encountered while using the software. This increased its writing time relative to the AIO article, but still did not approach the time required for the human-written article. The discrepancy in writing time between the human and AI articles shows that AI could be a valuable time-saving tool during this phase of preparing a review article, especially for writers who are less proficient in English vocabulary and writing techniques.
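The AIA resource-conversion step described above can itself be scripted. The following is a minimal sketch, assuming the source articles are available as local PDF files and using the open-source pypdf library; the directory and file names are hypothetical, and this illustrates the general workflow rather than the exact plugin pipeline used in our experiment.

```python
# Minimal sketch: extract the text layer of each source PDF so it can be
# supplied to ChatGPT one source at a time. Uses the open-source pypdf
# library; the "sources" directory and file names are hypothetical.
from pathlib import Path

from pypdf import PdfReader

def pdf_to_text(pdf_path: Path) -> str:
    """Concatenate the extracted text of every page in a PDF."""
    reader = PdfReader(str(pdf_path))
    return "\n".join(page.extract_text() or "" for page in reader.pages)

for pdf in sorted(Path("sources").glob("*.pdf")):
    txt = pdf.with_suffix(".txt")
    txt.write_text(pdf_to_text(pdf), encoding="utf-8")
    print(f"Converted {pdf.name} -> {txt.name}")
```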

Although the writing time was much lower for the AI-generated articles, additional time was added during the “fact checking” process. Through initial trials, our groups verified that ChatGPT tends to misattribute information to incorrect citations and occasionally fabricates resources. This required a thorough fact check of both the AIO and AIA papers, which added similar amounts of time to the totals: 8.35 and 7.08 h, respectively. This must be considered when contemplating the use of AI in writing a review article, as accuracy of information cannot be jeopardized in the pursuit of speed.

It is important to note that the various manuscripts were “written” by authors with differing levels of experience in writing scientific manuscripts and differing familiarity with the subject matter. The first drafts of the human and AIA papers were composed by medical students with little experience in the field, while the first draft of the AIO paper was written by a research faculty member with experience in scientific manuscript writing and familiarity with the subject matter. The author’s experience in scientific writing may have helped reduce the time required for editing and fact checking in the AIO-based model. Ultimately, subject familiarity qualifies authors to judge the quality of work generated by AI: experienced authors can recognize and edit overly ornamental language, inaccurate references, and overreaching statements. Recognizing these deficits would be more difficult for a novice in the field (such as a medical or other graduate student) and may have contributed to the differences in total times between the AIO and other manuscripts. Conversely, the AIO article underwent intense fact checking and rewriting of entire subsections to produce a publishable review. This involved effort beyond the first author (research faculty), including expert faculty and medical students. It can be inferred that even though the AIO article took less time to reach a publishable manuscript, the process was still labor intensive. The AIA article took less time overall than the human-written article, and the first authors of those manuscripts have very similar levels of experience in the field. The AIA approach also required substantial editing by the co-authors, especially content experts. AI users should keep these factors in mind when writing scientific research or review articles, as the time-saving benefits of AI are likely proportional to the author’s ability to use the software efficiently and to their expertise in the scientific area.

When comparing the first and final drafts of the human-written article, a much lower percentage of the text differed than in the AI-written articles. In the AIO article, this may be attributed to the fact that AI did not include sufficient information and resources for a comprehensive review of the topic; as a result, this information had to be manually inserted during the editing phase. One might expect this problem to be alleviated by feeding ChatGPT the exact resources needed to write a comprehensive review; however, AI does not always include the information that the author intends to use from a specific article. Manual inclusion of new information in the AIA article consequently led to a percentage of changed text similar to that seen in the AIO article. Authors should note that altering queries can help draw the desired information out of AI writing, but with the current versions of the software it is very likely that all drafts will ultimately require human intervention to reach publishable quality.
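For context, the fraction of text that changed between a first and final draft can be approximated with a simple sequence comparison. The sketch below uses Python’s standard difflib module as an illustrative word-level measure; it is an assumed demonstration, not the metric used in this experiment.

```python
# Illustrative sketch: approximate the percentage of word-level text that
# differs between two drafts using Python's standard-library difflib.
# This is a demonstration measure, not the one used in the study.
import difflib

def percent_changed(first_draft: str, final_draft: str) -> float:
    """Return the approximate percentage of words that changed."""
    a, b = first_draft.split(), final_draft.split()
    similarity = difflib.SequenceMatcher(None, a, b).ratio()  # 0.0 to 1.0
    return round((1.0 - similarity) * 100.0, 1)

first = "The first draft text of the review article ..."   # placeholder
final = "The edited, publishable text of the review article ..."
print(f"{percent_changed(first, final)}% of the text changed")
```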

Nearly half of the references the AIO model produced for the first draft were incorrect and unusable, requiring thorough fact checking and knowledge of the literature to create an appropriate draft. The AIA paper produced incorrect references at a much lower rate, likely because ChatGPT was not asked to list authors and dates of references, only to refer to them by the document ID number assigned by the PDF plugin. Even so, the AIA paper associated document ID numbers with the wrong information on six occasions, requiring thorough fact checking of this paper as well. The current AI software is excellent at sifting through endless pages of text to generate ideas and synthesize accurate paragraphs, but it struggles with proper citation. This issue can apparently be alleviated by using an AIA model that provides specific resources to the software, but this method likely increases production time while still requiring a fact check of the software’s writing.

As expected, the plagiarism similarity index of the AIA paper was higher than that of the human-written paper, at 36% and 10%, respectively. This percentage represents the amount of text in the first draft that was highly similar in wording to internet resources and published articles. Turnitin divides the scoring into five categories: blue for no matching text, green for 1–24%, yellow for 25–49%, orange for 50–74%, and red for 75–100% matching text. Initial queries for the AIA paper showed that ChatGPT would write with wording very similar to the articles it was asked to analyze, and this finding was confirmed by a similarity score in the yellow range, which is highly concerning for plagiarism. Interestingly, the first draft of the AIO paper scored 9%, slightly less than the human-written paper. This score falls in the green range and represents a much lower percentage of similar text than the AIA paper, suggesting that prompting AI to write without giving it explicit sources results in less textual overlap. The plagiarism similarity scores of the final drafts of the human and AIO papers were nearly identical to their respective first-draft scores. This may be attributed to the low starting scores, as both first drafts began well within the green zone and thus did not require editing of highly similar wording. The AIA paper required much more effort to bring its similarity score into the green zone. While AI-plagiarized phrases can and should be edited into original writing, this is time consuming and can be overlooked if proper plagiarism detection software is not used.
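For readers unfamiliar with the banding, the mapping from a similarity score to a Turnitin color category is a simple threshold rule. The snippet below restates the published bands described above and applies them to the first-draft scores reported here; the function name is our own illustration, not part of Turnitin’s software.

```python
# Restates Turnitin's similarity bands (described above) as a threshold
# rule and applies it to the first-draft scores reported in the text.
# The function name is illustrative, not part of Turnitin's software.
def turnitin_band(similarity_pct: int) -> str:
    """Map a similarity percentage (0-100) to its Turnitin color band."""
    if similarity_pct == 0:
        return "blue"    # no matching text
    if similarity_pct <= 24:
        return "green"   # 1-24% matching text
    if similarity_pct <= 49:
        return "yellow"  # 25-49% matching text
    if similarity_pct <= 74:
        return "orange"  # 50-74% matching text
    return "red"         # 75-100% matching text

for paper, score in [("human", 10), ("AIO", 9), ("AIA", 36)]:
    print(f"{paper} first draft: {score}% -> {turnitin_band(score)}")
```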

Overall, the results of our writing experiment support the notion that AI can expedite the lengthy process of writing a scientific review article. Users must be extremely thorough in the editing phase, as AI will hallucinate facts and cite resources incorrectly when responding to human queries. While AI is not yet a reliable way to produce scientific literature on its own, it offers a promising avenue for saving researchers’ valuable time.
