Improved precision oncology question-answering using agentic LLM

Abstract

Despite the widespread application of Large Language Models (LLMs) in biomedical research, their clinical adoption faces significant challenges. These challenges stem from concerns about the quality, accuracy, and comprehensiveness of LLM-generated answers. Most existing work has focused on fine-tuning foundation models, an approach that has not yet fully addressed accuracy and reliability issues. In this work, we propose an agent-based approach that aims to make LLM-based systems clinically deployable for precision oncology while mitigating common pitfalls such as hallucinations, incoherence, and "lost-in-the-middle" problems. To achieve this, we implemented an agentic architecture that fundamentally shifts the LLM's role from a simple response synthesizer to a planner. The agent orchestrates a suite of specialized tools that asynchronously retrieve information from various sources. These tools include curated document vector stores encompassing treatment guidelines, genomic data, clinical trial information, drug data, and breast cancer literature. The LLM then leverages its planning capabilities to synthesize the information retrieved by these tools, generating comprehensive and accurate responses. We demonstrate the effectiveness of the resulting system, GeneSilico Copilot, in the domain of breast cancer, where it achieves state-of-the-art accuracy. Furthermore, the system successfully generates personalized oncotherapy recommendations for real-world cases.
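The planner-plus-tools pattern described above can be illustrated with a minimal sketch. It is not the GeneSilico Copilot implementation: the tool names, the `retrieve` stub, and the keyword-based `plan` function are all hypothetical stand-ins (the abstract describes an LLM acting as planner and vector stores acting as tools). The sketch only shows the control flow of selecting tools and querying them concurrently.

```python
import asyncio

# Hypothetical tool names mirroring the sources listed in the abstract.
DEFAULT_TOOLS = ["treatment_guidelines", "breast_cancer_literature"]


async def retrieve(source: str, question: str) -> dict:
    """Stand-in for a vector-store lookup (assumption: a real tool would query an index)."""
    await asyncio.sleep(0)  # yield control, as an I/O-bound call would
    return {"source": source, "passage": f"[{source}] passage relevant to: {question}"}


def plan(question: str) -> list[str]:
    """Stand-in for the planner: decide which tools to call.

    A keyword rule is used here purely for illustration; in the
    described system this decision is made by an LLM.
    """
    q = question.lower()
    selected = list(DEFAULT_TOOLS)
    if "mutation" in q or "gene" in q:
        selected.append("genomic_data")
    if "trial" in q:
        selected.append("clinical_trials")
    if "drug" in q or "dose" in q:
        selected.append("drug_data")
    return selected


async def answer(question: str) -> dict:
    """Plan, fan out to the selected tools concurrently, and collect the evidence."""
    tools = plan(question)
    results = await asyncio.gather(*(retrieve(t, question) for t in tools))
    # A real system would pass this evidence back to the LLM for synthesis.
    return {"question": question, "tools": tools, "evidence": list(results)}


def run(question: str) -> dict:
    """Synchronous entry point for the sketch."""
    return asyncio.run(answer(question))
```

Calling `run("Which trial covers a BRCA1 mutation?")` returns a dictionary whose `evidence` list holds one entry per selected tool, in the order the planner chose them, since `asyncio.gather` preserves argument order.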

Competing Interest Statement

The authors have declared no competing interest.

Funding Statement

The entire study is funded by GeneSilico, Inc.

Author Declarations

I confirm all relevant ethical guidelines have been followed, and any necessary IRB and/or ethics committee approvals have been obtained.

Yes

The details of the IRB/oversight body that provided approval or exemption for the research described are given below:

The study used openly available human data sources such as PubMed, PharmGKB, clinical trial databases, DrugBank Open Data, and NCCN/ASCO/ESMO.

I confirm that all necessary patient/participant consent has been obtained and the appropriate institutional forms have been archived, and that any patient/participant/sample identifiers included were not known to anyone (e.g., hospital staff, patients or participants themselves) outside the research group so cannot be used to identify individuals.

Yes

I understand that all clinical trials and any other prospective interventional studies must be registered with an ICMJE-approved registry, such as ClinicalTrials.gov. I confirm that any such study reported in the manuscript has been registered and the trial registration ID is provided (note: if posting a prospective study registered retrospectively, please provide a statement in the trial ID field explaining why the study was not registered in advance).

Yes

I have followed all appropriate research reporting guidelines, such as any relevant EQUATOR Network research reporting checklist(s) and other pertinent material, if applicable.

Yes

Data Availability

This is already mentioned in the Data Availability section of the manuscript.
