Pilot study on large language models for risk-of-bias assessments in systematic reviews: A(I) new type of bias?

Risk-of-bias (RoB) assessment is used to evaluate randomised controlled trials for systematic errors. Developed by Cochrane, it is considered the gold standard for assessing RoB in studies included within systematic reviews (SRs), and it forms a key part of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.1 The RoB tool comprises six domains that may signify bias: random sequence generation, allocation concealment, blinding of participants and personnel, attrition bias, reporting bias and other potential biases.2 This assessment is an integral component of evaluating the quality of evidence; however, it is a time-consuming and labour-intensive process.

Large language models (LLMs) are a form of generative artificial intelligence (AI) trained on large volumes of data. ChatGPT is an LLM developed by OpenAI, capable of generating a wide variety of responses to user prompts. Concerns exist around the application of such AI tools in research, including ethical, copyright, plagiarism and cybersecurity risks.3 Nevertheless, LLMs are increasingly popular with investigators seeking to streamline analyses, and studies have begun investigating their potential role in the RoB assessment process.4 5 Given the flexibility and rapidly evolving nature of LLMs, our goal was to explore whether ChatGPT could be used to automate the RoB assessment process without sacrificing accuracy. This study offers an assessment of the applicability of LLMs in SRs as of December 2023.
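To illustrate what such automation might look like in practice, the sketch below queries an LLM for a judgement on each Cochrane RoB domain via the OpenAI Python client. It is a minimal illustration only: the model name, prompt wording and helper function are assumptions for this example and do not represent the prompts or workflow actually used in this study.

```python
# Minimal sketch of prompting an LLM for RoB domain judgements.
# Assumptions (not from the study): the OpenAI Python client (v1.x),
# the "gpt-4" model name and the prompt wording are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

ROB_DOMAINS = [
    "random sequence generation",
    "allocation concealment",
    "blinding of participants and personnel",
    "attrition bias",
    "reporting bias",
    "other potential biases",
]

def assess_domain(study_text: str, domain: str) -> str:
    """Ask the model for a low/high/unclear judgement on one RoB domain."""
    response = client.chat.completions.create(
        model="gpt-4",  # hypothetical model choice
        messages=[
            {"role": "system",
             "content": "You are assisting with Cochrane risk-of-bias "
                        "assessment of a randomised controlled trial."},
            {"role": "user",
             "content": f"For the domain '{domain}', judge the risk of bias "
                        f"as 'low', 'high' or 'unclear', with a one-sentence "
                        f"justification.\n\nStudy text:\n{study_text}"},
        ],
    )
    return response.choices[0].message.content

# Example: collect judgements for all six domains over one paper's text.
# judgements = {d: assess_domain(paper_text, d) for d in ROB_DOMAINS}
```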

This study sits within an SR (PROSPERO CRD420212479050). Two reviewers (SH and HALL) performed RoB assessment on n=15 full-length papers in portable document format (PDF) (table 1). Six domains were assessed independently alongside an added …
