The Next Horizon of Drug Development: External Control Arms and Innovative Tools to Enrich Clinical Trial Data

Definitions, Categories, and Construction

The International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) Guideline E10 defines an externally controlled trial as “one in which the control group consists of subjects who are not part of the same randomized study as the group receiving the investigational agent, i.e., there is no concurrently randomized control group” [17]. While external controls are utilized as stand-alone comparators, a trial may also enroll subjects in a concurrent control arm and augment it using an external control arm (hybrid control arms) [18, 19]. Recently, the U.S. Food & Drug Administration has issued a draft guidance on externally controlled trials [6].

ECAs may be categorized as concurrent ECAs and historical ECAs: concurrent ECAs use subject data collected during the same time period in which subjects receive the investigational agent, while historical ECAs use subject data collected at an earlier time [17]. An ECA is constructed by choosing the control arm’s subjects for comparison with current experimentally treated subjects; construction usually comprises the following steps [7]:

1. Identification of an external data source that provides a similar population as studied in the clinical trial.

2. Data entry, reading, and processing to generate an analysis-ready data file.

3. Statistical selection or adjustments using subject-level data to balance baseline covariates between arms.
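As an illustration of the third step, covariate balance between the trial arm and a candidate external cohort is often summarized with the standardized mean difference (SMD). The sketch below is a minimal example under stated assumptions: the function name and the age values are hypothetical, and the commonly cited threshold of roughly 0.1 for acceptable balance is a rule of thumb rather than a requirement from the sources cited here.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Absolute standardized mean difference for one continuous covariate.

    Uses the square root of the average of the two sample variances
    as the pooling standard deviation.
    """
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2.0)
    if pooled_sd == 0:
        return 0.0
    return abs(mean(treated) - mean(control)) / pooled_sd


# Hypothetical baseline ages in years (illustrative only, not real trial data).
trial_age = [61, 58, 65, 70, 63, 59]
external_age = [72, 68, 75, 66, 71, 74]

smd_age = standardized_mean_difference(trial_age, external_age)
print(f"SMD for age before adjustment: {smd_age:.2f}")
```

A large SMD before adjustment, as in this toy cohort, would signal that selection or weighting is needed before the external subjects can serve as a comparator.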

Data Sources

While data for concurrent ECAs typically come from concurrent observational studies or concurrent patient registries, historical ECAs may use data from historical CTs or real-world data (RWD) such as historical observational studies, patient registries, electronic health records, insurance claims, or publications.

Historical CT data, especially those from large and well-conducted RCTs for the same disease and similar patient populations, are more suitable when the RCTs follow ICH guidelines. Such data usually are more accurate and complete than most RWD, generally have baseline demographic and clinical characteristics variables similar to those for the target clinical trial and are more likely to use similar definitions for disease, patient inclusion and exclusion criteria, and outcome measures. Historical CT data are often used when pooled RCT data are available. Examples of sources for the pooled subject-level clinical trial data include the Historical Trial Data (HTD) Sharing Initiative [20] and Medidata Enterprise Data Store (MEDS) [5]. HTD Sharing Initiative was established to share de-identified data to maximize the value of clinical data collected historically in the control arms of clinical trials [20]. The MEDS has amassed a pool of more than six million anonymized subjects from nearly 20,000 previous clinical trials [4]. Other sources include Project Data Sphere (PDS), which collects, curates, and aggregates clinical trial data on its open-access platform allowing researchers to develop external control arms from subject-level data [21]. The Yale University Open Data Access (YODA) Project is another source of open-source data access [22].

RWD may also be leveraged to create external controls [9, 13, 23,24,25,26]. RWD is a particularly useful data source when historical RCT data are unavailable or unsuitable for comparison, e.g., in rare diseases where there is often a paucity of prior CTs due to a lack of available treatments and insufficient sample sizes in patient enrollment. Frameworks for RWD and real-world evidence (RWE) have been developed by the United States (US) Food and Drug Administration (FDA), the European Medicines Agency (EMA) of the European Union (EU), and Japan’s Pharmaceuticals and Medical Devices Agency (PMDA). These regulatory agencies support the various uses of RWD for regulatory purposes [27,28,29,30]. Because the collection of RWD often does not follow ICH guidelines or clinical practices, researchers need to closely examine the validity, reliability, and relevance of the data when using them to create ECAs.

Irrespective of the data source, the quality of the ECA depends on the comparability of treatment approaches, completeness of patient attributes captured, and the robustness of endpoint assessment to ensure good matching using methods such as propensity scores and comparability with experimental trial subjects.

Applications

ECAs are particularly useful where implementing an RCT may not be feasible or ethical. Such settings include, for example, testing an investigational drug for a rare disease with no alternative treatment or established standard of care, or for a disease with high unmet need where subjects are very difficult to find (e.g., a fast-progressing cancer with increased mortality, or a vulnerable population such as pediatric subjects) [23,24,25]. For indications where CTs are often run as single-arm trials in which all participating subjects are assigned the investigational drug [8], ECAs may provide the data needed to assess the efficacy and safety of the investigational intervention, as seen, for example, in Celsion’s OVATION trials using Medidata Synthetic Control Arm® [31].

For RCTs testing drugs for conditions with an inadequate standard of care, hybrid ECA designs have been suggested to augment in-trial control arms. In this approach, multiple subjects are included in the external control arm for each subject in the in-trial control arm, i.e., at a k:1 ratio. This hybrid approach allows more subjects to be randomized to the investigational drug while preserving some randomization [4, 19, 32].

Benefits

The use of ECAs allows the entire or a larger proportion of the participants of a CT to be assigned to the experimental treatment arm, which significantly boosts patient welfare when the novel treatment is hypothesized to have better safety or efficacy compared to the standard of care. This is particularly important when no current treatment exists. This advantage also obviates the quandary when subjects do not want to be assigned to a standard of care that they may perceive as inadequate. ECAs not only allow a larger proportion of patients to be assigned to the investigational arm but also help ensure that the quality of evidence generated by CTs in diseases with small and/or hard-to-recruit populations is high, and they help enhance the inclusion of such populations.

Improved trial efficiency could allow RCTs to complete faster, enabling drugs to reach approval and the market sooner (if the ECA methods used and trial results are accepted by regulators), thereby benefitting subjects not enrolled in CTs who might otherwise have inferior (or no) treatment options. Besides shortening the time for new drug approval and time to market, improving trial efficiency also helps reduce the cost to sponsors for new drug development by reducing the number of subjects needed for the CTs required for the drug’s approval. There are several benefits in terms of metrics [33].

In addition, it has been suggested that ECAs may also provide sponsors and regulators in the future with the evidence needed to support expedited conditional approval or with an additional source of evidence to translate conditional approvals to full approvals or approve additional indications (label expansion), increasing the pool of subjects who can benefit from the therapy [24]. ECAs may also allow for the comparison of the investigational drug against a broader set of comparators and patient types.

In cases where the comparator arm may have otherwise been compromised (e.g., due to lower adherence or higher dropout rates if the comparator treatment becomes less effective due to evolutions in clinical practice over the course of the CT [34]), a carefully selected ECA cohort can still help estimate the treatment effect with a high degree of accuracy. ECAs can also help when subjects may be reluctant to enroll if the comparator or reference product has been superseded in clinical practice or there is a perceived risk–benefit tradeoff with older products (e.g., nocebo effect) [35].

Challenges and Potential Biases

The biggest challenge is to find relevant and high-quality data for ECAs, as discussed above. As more and more sponsors have contributed their historical RCT data to the pooled CT databases such as the Historical Trial Data (HTD) Sharing Initiative and the Medidata Enterprise Data Store (MEDS) mentioned above, and as more and more RWD become available, the shortage of relevant and high-quality data for ECAs can be gradually eased.

Another major challenge is potential confounders and biases, especially for RWD-based ECAs, which can make it difficult to estimate with confidence the efficacy and safety profile of the investigational therapy [36, 37]. A confounder is a variable correlated with both the outcome and the intervention without being an intermediate cause in the causal pathway between intervention and outcome. It is essential to find and use data with a sufficiently large number of covariates/baseline variables to identify the potential confounders and minimize the potential biases [36,37,38,39].

Without the needed variables, no statistical methods may be able to comprehensively correct for all potential confounding factors that have been identified by other researchers in other studies. When the appropriate data are available, advanced statistical methods may be used to reduce or remedy the potential biases caused by those confounding factors. These methods are discussed in more detail in the sections below.

As with other external controls, the nature and quality of the underlying external data are critical for the rigor and validity of ECAs. Thus, several biases may affect these data sources, and statistical methods may be considered to mitigate their effects.

One of the main reasons regulatory agencies favor randomization in CTs, i.e., randomized controlled trials (RCTs), is to clearly establish a potential causal link between a therapy and the observed outcome [40,41,42,43,44]. These approaches can account for effects of treatment intent, time-varying treatment, and confounding for multiple treatment effects [45]. RCT emulations may also be conducted [16, 46, 47]. However, due to the lack of randomization, there are potential biases to consider when building an ECA (Table 1).

Table 1 Sources of Biases [6, 74, 117, 120]

Matching Methods

Advanced methods (e.g., propensity score matching [PSM]) are increasingly applied to ensure that the subjects in the current trial and historical benchmarks are as similar as possible. Reducing the differences between the patient characteristics in an experimental arm and an ECA can be achieved through matching methods, which also address sources of confounding and selection bias. Confounding was discussed earlier, while selection bias is best described as a “fundamental difference between the patients included in the treatment arms of a study due to the way in which patients were allocated to the treatment groups” [48].
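To make the matching idea concrete, the following is a minimal sketch of 1:1 propensity score matching between trial subjects and a larger external pool. It is not the procedure used in any of the cited studies. The covariates (age and a binary performance-status flag), the simulated cohort, the hand-written logistic regression, the greedy nearest-neighbor rule, and the caliper of 0.1 on the probability scale are all assumptions chosen for illustration.

```python
import numpy as np


def fit_propensity(X, treated, n_iter=2000, lr=0.1):
    """Estimate P(trial arm | covariates) with logistic regression
    fitted by plain gradient descent."""
    X = np.asarray(X, dtype=float)
    t = np.asarray(treated, dtype=float)
    # Standardize covariates so a single learning rate suits every column.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    Z = np.column_stack([np.ones(len(Z)), Z])  # add intercept column
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        beta -= lr * Z.T @ (p - t) / len(t)
    return 1.0 / (1.0 + np.exp(-Z @ beta))


def greedy_match(ps, treated, caliper=0.1):
    """1:1 nearest-neighbor matching without replacement.

    Trial subjects with no external subject within the caliper
    stay unmatched.
    """
    treated = np.asarray(treated, dtype=bool)
    available = set(np.flatnonzero(~treated).tolist())
    pairs = []
    for i in np.flatnonzero(treated):
        if not available:
            break
        j = min(available, key=lambda c: abs(ps[c] - ps[i]))
        if abs(ps[j] - ps[i]) <= caliper:
            pairs.append((int(i), int(j)))
            available.remove(j)
    return pairs


# Simulated cohort (hypothetical): 50 trial subjects, 300 external subjects.
rng = np.random.default_rng(0)
n_trial, n_ext = 50, 300
age = np.concatenate([rng.normal(60, 8, n_trial), rng.normal(66, 10, n_ext)])
ecog = np.concatenate([rng.binomial(1, 0.3, n_trial), rng.binomial(1, 0.5, n_ext)])
X = np.column_stack([age, ecog])
in_trial = np.concatenate([np.ones(n_trial), np.zeros(n_ext)])

ps = fit_propensity(X, in_trial)
pairs = greedy_match(ps, in_trial)
matched_ext = [j for _, j in pairs]
print(f"matched {len(pairs)} of {n_trial} trial subjects")
```

In practice, balance diagnostics on the matched sample would follow, and the matching specification would be fixed in advance rather than tuned after seeing outcomes.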

Some recent case examples of ECAs that have employed PSM include those from the Friends of Cancer Research working group in both lung cancer and multiple myeloma [8, 49]. PSM of pooled subject-level historical trial data was used to replicate results from the control groups of prior CTs with a high degree of similarity to the original outcomes. Additionally, regulatory guidance documents suggest that reducing selection bias starts with a priori selection of the external control group before conducting any comparative analyses and suggest documenting the analytic approaches in a pre-specified protocol and statistical analysis plan [26].

Bayesian Methods

Bayesian approaches have been applied to CTs for adaptive data borrowing, including power priors, commensurate priors, meta-analytic predictive priors, and robust mixture priors [50,51,52,53]. For example, the Bayesian case example repository, supported by the Drug Information Association’s Bayesian Scientific Working Group, contains a series of case studies demonstrating examples of the use and value of Bayesian statistics in medical product development [54]. In particular, Bayesian approaches can be useful for pediatric trial designs [55]. Additionally, the FDA recognizes and provides guidance on Bayesian adaptive designs [56, 57]. However, it is worth noting that the FDA cautions about using adaptive designs with smaller sample sizes, as they may fail to provide outcomes on subpopulations with insufficient statistical power [56]. This is particularly pertinent for hybrid study designs with small samples, where historical information can be used to inform the prior distribution, increasing the statistical power for future (i.e., posterior) conclusions [58]. These approaches are readily applicable to external controls [59].
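As a small worked example of data borrowing, the sketch below applies a power prior with a fixed discount parameter to a binary endpoint. It assumes a Beta(1, 1) initial prior and a conjugate beta-binomial model; the function names, the response counts, and the choice of discount values are hypothetical and are not taken from the cited references.

```python
def power_prior_posterior(x, n, x0, n0, a0, prior_a=1.0, prior_b=1.0):
    """Posterior Beta(a, b) for a control response rate.

    x, n   : responders and subjects in the concurrent control arm
    x0, n0 : responders and subjects in the historical (external) data
    a0     : discount in [0, 1]; 0 ignores the historical data,
             1 pools it fully with the concurrent data
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    a = prior_a + a0 * x0 + x
    b = prior_b + a0 * (n0 - x0) + (n - x)
    return a, b


def beta_mean(a, b):
    return a / (a + b)


def beta_var(a, b):
    return a * b / ((a + b) ** 2 * (a + b + 1))


# Hypothetical counts: 6/20 concurrent controls, 45/150 historical controls.
for a0 in (0.0, 0.5, 1.0):
    a, b = power_prior_posterior(6, 20, 45, 150, a0)
    print(f"a0={a0:.1f}: posterior mean={beta_mean(a, b):.3f}, "
          f"variance={beta_var(a, b):.5f}")
```

Larger values of the discount parameter shrink the posterior variance, which is the source of the gain in power, but they also increase the risk of bias if the historical and concurrent controls differ. The adaptive priors listed above aim to set the degree of borrowing from the data rather than fixing it in advance.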

Timing of Trials

It is essential to account for the differences in timing to capture study observations between a CT and external control. This pertains to mitigating sources of ascertainment bias. Here, ascertainment bias is “the systematic distortion of the assessment of outcome measures by researchers or study participants” [60].

Using a historical CT for ECAs mitigates much of this concern in ascertainment bias, as external data are likely to be from a similar setting of control and scrutiny. However, care should still be taken to review trial protocols and assess the similarity of periodicity and rigor of assessment in trial data included [8]. Careful selection of matching variables and matching approaches should be used. Another method for identifying and adjusting for ascertainment bias is using positive and negative controls, where positive controls are the variables known to impact the outcomes of interest and negative controls are variables that are known not to causally affect the outcome [61].

In a study by Desai et al., an association of diabetes with both hereditary fructose intolerance and Alpha-1 Antitrypsin deficiency, two rare diseases, was assessed across multiple data sources [62]. Positive and negative controls were used to calibrate the strength of association to account for possibly higher levels of examination and intervention in diagnosed rare disease subjects. A similar approach was used by Schuemie et al. in RWD to compare associations with dabigatran, warfarin, and gastrointestinal bleeding, as well as those of selective serotonin reuptake inhibitors and upper gastrointestinal bleeding [63]. Both examples used positive and negative controls to calibrate confidence intervals to determine the statistical significance of observed effect sizes. The discrepancies between two conflicting RWE studies were explained [64]. Addressing the treatment adherence/compliance bias requires active awareness of this issue and ensuring adequate insight into the data to assess it. Consideration of screen failure rates and discontinuation rates is required for historical RCTs. For RWD, sufficient capture of diagnosis, healthcare encounters, procedures, treatment administration, and prescription fill or refills, etc., as pertinent to the question at hand is required.
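The calibration idea behind the negative-control examples above can be sketched as follows. This is a deliberately simplified version, not the procedure of the cited studies: it fits a normal "empirical null" to the log relative risks observed for negative controls while ignoring each negative control's own sampling error, which full empirical calibration methods account for. All numbers and function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_empirical_null(neg_control_log_rr):
    """Mean and SD of log relative risks for exposures assumed to have no effect."""
    return mean(neg_control_log_rr), stdev(neg_control_log_rr)


def p_value(log_rr, se, null_mean=0.0, null_sd=0.0):
    """Two-sided p-value against a normal null.

    With the defaults this is the usual uncalibrated p-value; passing an
    empirical null shifts and widens the reference distribution.
    """
    z = (log_rr - null_mean) / math.sqrt(null_sd ** 2 + se ** 2)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical negative-control estimates showing systematic upward bias.
neg_controls = [0.10, 0.25, 0.05, 0.30, 0.15, 0.20, 0.00, 0.35]
null_mean, null_sd = fit_empirical_null(neg_controls)

# Hypothetical estimate of interest: log relative risk 0.40, standard error 0.15.
p_uncal = p_value(0.40, 0.15)
p_cal = p_value(0.40, 0.15, null_mean, null_sd)
print(f"uncalibrated p={p_uncal:.4f}, calibrated p={p_cal:.4f}")
```

In this toy setting, an estimate that looks statistically significant against a theoretical null no longer does once the systematic error suggested by the negative controls is taken into account.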

Subject Level Meta-Analysis

Pooling of historical CT data and/or RWD also enables various other applications, including target selection for new mechanisms, trial design and optimization, trial recruitment, health technology assessment, and market access approval, and post-approval applications for the verification of effectiveness and life cycle management, label expansion, and drug repurposing [65].

A meta-analysis should be conducted to estimate the treatment effect associated with the intervention and to understand the uncertainty around the effect. Traditional meta-analyses use aggregate results from multiple CTs based on data available in publications or on an individual patient level. In clinical development, they often serve as a starting point for effect size estimates in trial planning, aiding in comparator selection and power calculations. Meta-analyses can be of aggregated data reported in the studies or of the individual subject-level data. Data can then be systematically pooled (e.g., random-effects model or fixed-effects model [66, 67]), affording a greater sample size than can be achieved in a single trial. Within pooled data, inclusion and exclusion criteria can be matched to those of a potential new CT. Multi-arm (e.g., indications and dosages) trial cohorts can be stratified as needed. The timing of outcome assessments can be aligned for consistency. For differences in composite endpoint calculation, individual outcome elements, if available, can be used to standardize outcome assessments across trials.
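The pooling step for aggregate data can be illustrated with inverse-variance weighting. The sketch below computes a fixed-effect estimate and a random-effects estimate using the DerSimonian-Laird moment estimator for between-trial variance, which is one common choice among several. The log hazard ratios and variances are hypothetical values, not results from any cited trial.

```python
def fixed_effect(effects, variances):
    """Inverse-variance weighted pooled estimate and its variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * y for w, y in zip(weights, effects)) / total
    return estimate, 1.0 / total


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its variance, and tau^2."""
    if len(effects) < 2:
        raise ValueError("need at least two studies")
    weights = [1.0 / v for v in variances]
    fe_est, _ = fixed_effect(effects, variances)
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(w * (y - fe_est) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    tau2 = max(0.0, (q - df) / c)
    # Add the between-trial variance to each study's own variance.
    adjusted = [v + tau2 for v in variances]
    re_est, re_var = fixed_effect(effects, adjusted)
    return re_est, re_var, tau2


# Hypothetical log hazard ratios and their variances from three trials.
log_hr = [-0.60, -0.05, -0.45]
var_log_hr = [0.02, 0.02, 0.03]

fe_est, fe_var = fixed_effect(log_hr, var_log_hr)
re_est, re_var, tau2 = dersimonian_laird(log_hr, var_log_hr)
print(f"fixed: {fe_est:.3f} (var {fe_var:.4f})")
print(f"random: {re_est:.3f} (var {re_var:.4f}), tau^2={tau2:.3f}")
```

When the trials disagree more than their sampling error would explain, the random-effects variance is larger than the fixed-effect variance, reflecting the added uncertainty about the effect in a new trial.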

Lifecycle Management

An important component of extending the value of therapeutics is lifecycle management (LCM). This can include maintaining market approval, as well as enhancement of value through indication expansion, reformulation, or repurposing [68]. Maintaining market approval is an issue that has arisen in Europe, the Middle East, and Africa for long-approved off-patent products in disease areas where newer (potentially more efficacious/effective) treatments are available and the standard of care has evolved. In such cases, the regulators may seek the assurance of continued therapeutic benefit as part of market re-authorization, and the lifecycle stage may not be conducive to conducting Phase IV trials. For these cases, pooled CT data for external comparators have several advantages over RWD alone. When paired with subject-level meta-analytic or ECA approaches, this provides the ability to compare evolving performance benchmarks over time, although such comparisons may be limited or infeasible if diagnostic criteria or endpoint preferences have significantly changed. With the advent of interchangeable biosimilars (a biosimilar product that may be substituted without the intervention of the healthcare professional who prescribed the reference product, much like a generic drug for a branded drug) [69, 70] in the US, the above approach also has potential applications for supporting future biosimilar approvals [71]. Indication expansion and drug repurposing efforts may similarly benefit from an external benchmark, ECA, or hybrid approaches. As the drugs in question have already met efficacy and safety hurdles, there is already a precedent for the supportive use of RWE in this application through existing and ongoing RWD and RWE. The FDA approval of palbociclib for male breast cancer, which was expanded from female breast cancer, included supportive EHR data (see, e.g., [72]). When paired with ECA or hybrid approaches, accelerated drug development and approval may possibly be achieved.

Simulated Data

A promising and emerging approach for working with subject-level data in a secure manner is to employ simulated data. Simulated subject-level data can be created from existing data to preserve patient anonymity and prevent accidental or potential identification of subjects. This can be applied to either RWD or CT data. Simulated data preserve the relationships that exist in source data, but they alter the identifying information about each of the subjects that make up the cohort. Unlike individual-level meta-analyses, simulated data may more easily be shared without patient-specific information.
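The idea of preserving relationships while replacing individual records can be illustrated with a very simple parametric generator. The sketch below fits a mean vector and covariance matrix to a source cohort and draws new records from a multivariate normal distribution. This is an assumption made for illustration only: practical synthetic data methods are typically more sophisticated, and drawing from a fitted distribution does not by itself guarantee that privacy requirements are met. The source cohort here is itself simulated and hypothetical.

```python
import numpy as np


def simulate_cohort(source, n_subjects, seed=0):
    """Draw synthetic records that share the source's means and covariances.

    No source row is copied; only the fitted summary statistics are used.
    """
    source = np.asarray(source, dtype=float)
    mean = source.mean(axis=0)
    cov = np.cov(source, rowvar=False)
    rng = np.random.default_rng(seed)
    return rng.multivariate_normal(mean, cov, size=n_subjects)


# Hypothetical source cohort: age (years) and systolic blood pressure (mmHg).
rng = np.random.default_rng(1)
age = rng.normal(60, 10, 500)
sbp = 100 + 0.5 * age + rng.normal(0, 8, 500)
source = np.column_stack([age, sbp])

simulated = simulate_cohort(source, n_subjects=2000)

corr_source = np.corrcoef(source, rowvar=False)[0, 1]
corr_simulated = np.corrcoef(simulated, rowvar=False)[0, 1]
print(f"age-SBP correlation: source={corr_source:.2f}, "
      f"simulated={corr_simulated:.2f}")
```

The simulated cohort reproduces the age and blood pressure correlation of the source closely, while each simulated row corresponds to no actual subject.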

By allowing the use of historical CT data while preserving patient anonymity, a full anonymization or de-identification approach increases the flexibility in leveraging these databases. While the terms anonymization and de-identification are often treated as synonyms, their meanings differ subtly, and regulators differ in which term they prefer [73,74,75].

According to EDUCAUSE (https://www.educause.edu/), anonymization is “the act of permanently and completely removing personal identifiers from data, such as converting personally identifiable information into aggregated data. Anonymized data is data that can no longer be associated with an individual in any manner.” In comparison, “de-identification involves the removal of personally identifying information in order to protect personal privacy. In some definitions, de-identified data may not necessarily be anonymized data. This may mean that the personally identifying information may be able to be re-associated with the data at a later time” [76]. Europe’s General Data Protection Regulation (GDPR) tends to use the term anonymization and defines anonymous information as the “information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.” (GDPR Recital 26) [77]. In comparison, the US regulations tend to use the term de-identification. For example, the Health Insurance Portability and Accountability Act (HIPAA) defines de-identification as the process by which identifiers are removed from the health information following the de-identification standard and implementation specifications in HIPAA §164.514(a)-(b), and the de-identified health information as the “health information that does not identify an individual and with respect to which there is no reasonable basis to believe that the information can be used to identify an individual.” (HIPAA §164.514) [
