Reliability of study endpoint adjudication in a pragmatic trial on brain arteriovenous malformations

Of all scientific concepts, reliability or repeatability is the most fundamental. In colloquial terms, reliability is best understood as trust. We are more likely to trust the doctor who, when asked the same question twice (“What is the cause of my problem?”), provides the same answer both times. An instrument or a test should provide the same result when the same object is measured repeatedly. Similarly, a diagnosis made on an imaging study should be repeatable when the study is examined by various specialists on various occasions. Perhaps because it is so fundamental, reliability is more frequently assumed than tested [1]; or perhaps this is because, when tested, reliability is often disappointingly low [2], [3].

The results of a clinical trial are given in terms of primary and secondary outcomes that are obtained for each patient. In large trials supported by industry, the adjudication of a clinical outcome or event is typically delegated to an independent, central clinical event or clinical endpoint review committee [4], [5]. Despite standardized definitions and reviews by experts, discrepancies between judgements can occur, but they are resolved by consensus, which hides the variability and uncertainty behind the final data [5]. Yet such variability deserves to be studied, for two reasons. First, variability should be considered in the interpretation of trial results. Second, in any given field, reliability studies allow the selection or development of more reliable instruments; in this particular case, more reliable clinical endpoints. This may be particularly important for pragmatic randomized controlled trials (RCTs) integrated into routine clinical practice, such as care trials [6]. We sought to determine the reliability of the adjudication of study endpoints when the case report forms (CRFs) of a pragmatic trial were examined.

The Treatment of Brain AVMs Study (TOBAS) is a care trial that includes two RCTs and several prospective registries, including an observation registry. Patients typically follow a non-invasive imaging surveillance plan, along with routine clinical visits or phone calls to determine whether they have reached any of the multiple study endpoints, which include functional status (modified Rankin Scale [mRS] score) and any untoward events, including but not limited to rupture or re-rupture of the AVM.

When the case report data are evaluated at the end of a trial to determine whether a patient has reached a study endpoint, reliability is crucially important. Case report forms are typically completed by research personnel, often rotating, who may have only a superficial understanding of an uncommon disease such as brain AVMs. Certain endpoints, by their very nature, are more reliable than others: there is usually no debate about whether a patient has died. However, a slightly different outcome, such as disease-related or treatment-related death, leaves room for subjectivity and variations in adjudication. The 7-category mRS is a validated tool to assess functional outcome after stroke [7]. In a study with long follow-up, the mRS can be recorded multiple times, and evaluations can be performed by a variety of personnel involved in the patient's care, which can lead to variability in the score assigned at various times. Given multiple data entries, how reliable is the endpoint of an increase in mRS of +1 at any time? Other endpoints, such as intracranial hemorrhage, are actually surrogate outcomes for what we care most about: the risk of disability or death from AVM hemorrhage. But what is a hemorrhage? A massive hemorrhage from an AVM is unlikely to be missed. However, a small intraventricular hemorrhage in a Spetzler-Martin Grade 5 AVM may be recorded on a different electronic form and only when imaging is performed, and an incidental finding of hemosiderin on MRI may be variably interpreted. Finally, whether an adverse event qualifies as “serious,” and whether it is related to the AVM or its treatment, are also subject to interpretation.

We have recently reviewed the progress of the TOBAS study, including the observation registry [8], [9], [10]. Discrepancies in endpoint determination between 2 members of the adjudication committee were noted and were resolved in a consensus session in order to report results. Because the trial includes a 10-year follow-up period, analyses will likely be repeated in the future, if only to provide interim data to the data safety and monitoring committee at regular intervals. Therefore, we sought to better understand the reliability of the interpretation of the TOBAS endpoint data.
