Application of Bayesian approaches in drug development: starting a virtuous cycle

The goal of discussing selected examples in this section is to demystify and normalize the use of Bayesian methods in various scenarios and demonstrate that the risk of using these approaches may not be as high as perceived by some stakeholders, and the advantages may be relevant for efficient drug development. Additional examples of the use of Bayesian approaches in the regulatory setting have been described elsewhere10, 11.

Generating substantial evidence

In many cases, having multiple studies to demonstrate drug efficacy and safety is required by regulators because of the scientific value of replication. There are several ways to undertake a multiple-study drug development programme, including conducting studies in parallel or in sequence. When carrying out clinical studies in sequence, which is natural and most common in drug development, Bayesian methods could provide a beneficial approach for generating substantial evidence of the treatment effect at reduced cost and time of development without sacrificing scientific credibility.

In the case of phase II studies that are carried out as a precursor to phase III, valuable data on the dose and the patient population are generated, and the posterior distribution of the treatment effect size on a primary response variable of interest can be used as a prior for phase III planning. Compared with phase II data, the phase III data will be generated on the same treatment, with a highly similar patient population, by the same sponsor, in a nearly contemporaneous time frame, often involving some of the same investigative sites. Thus, the phase II data can often be highly relevant for creating a prior for phase III, even if aspects such as the inclusion and exclusion (I/E) criteria change, or the treatment formulations change slightly. Such refinements in phase III can be easily handled by using discounting factors (that is, less weight or less borrowing of prior information) in the Bayesian analysis that are mutually agreeable to sponsor and regulator. In general, the degree to which the sample size of phase III studies can be reduced while maintaining suitably high power is directly related to the quantity and quality of the phase II data12 as well as the amount of borrowing of that information, which can be mathematically defined in the Bayesian analysis. Marked changes from phase II to phase III in any of the clinical trial design factors would result in larger discounting of the phase II data. Ruberg et al.13 present an example of this approach with further details and considerations.
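
To make the idea of discounting concrete, the following sketch shows a simple power-prior style analysis under a normal approximation: the phase II estimate forms the prior, its variance is inflated by a discount factor a0, and the phase III data update it to a posterior probability that the treatment effect is positive. All numbers are hypothetical and are not drawn from any cited programme.

```python
import numpy as np
from scipy import stats

# Hypothetical phase II summary: estimated treatment effect and standard error
# (normal approximation to the analysis of the primary response variable).
theta_ph2, se_ph2 = 0.40, 0.18

# Discount factor in [0, 1]: 1 borrows the phase II data fully, 0 ignores them.
# Larger design changes from phase II to phase III would argue for a smaller value.
a0 = 0.5

# Discounted phase II prior: same mean, variance inflated by 1/a0.
prior_mean, prior_var = theta_ph2, se_ph2**2 / a0

# Hypothetical phase III summary statistics.
theta_ph3, se_ph3 = 0.30, 0.12

# Conjugate normal-normal update: precisions add, means are precision-weighted.
post_var = 1 / (1 / prior_var + 1 / se_ph3**2)
post_mean = post_var * (prior_mean / prior_var + theta_ph3 / se_ph3**2)

# Posterior probability that the treatment effect is positive.
p_benefit = 1 - stats.norm.cdf(0, loc=post_mean, scale=np.sqrt(post_var))
lo, hi = stats.norm.interval(0.95, loc=post_mean, scale=np.sqrt(post_var))
print(f"Posterior mean {post_mean:.3f}, 95% credible interval ({lo:.3f}, {hi:.3f}), "
      f"Pr(effect > 0) = {p_benefit:.3f}")
```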

When phase III studies are carried out sequentially — either two identical phase III studies carried out in sequence to defer cost and risk of development or phase III studies carried out in different disease states or patient populations (for example, sacubitril–valsartan in heart failure with reduced ejection fraction and subsequently in heart failure with preserved ejection fraction) — borrowing data and/or information from initial phase III studies to form priors can also reduce the sample size, cost and time for subsequent clinical trials using a Bayesian approach, without reducing the quality of the inference about a beneficial treatment effect on the primary outcome. This potential reduction in sample size has further implications, especially in the evaluation of safety.

First, it is worth noting that even with the use of prior information and Bayesian approaches, some situations may still require large sample sizes — although smaller than without the use of prior information — in phase III for demonstration of a beneficial treatment effect, thereby creating a sufficient safety database to assess the benefit–risk trade-off of the new treatment. However, in other situations, the Bayesian approach, although still providing credible evidence of a treatment effect, can result in fewer patients being exposed in clinical trials and thus less overall evidence about the efficacy and safety of an investigational product. From an efficacy perspective, there may be less opportunity to assess the treatment on secondary end points or in subgroups of patients that may be of interest. Having fewer patients in phase III RCTs is particularly important in the context of safety assessments of a new treatment. Even in some traditional drug development programmes, the evaluation of efficacy alone would require smaller sample sizes, but phase III RCTs are deliberately over-powered with very large sample sizes to create a sufficiently large safety database. Whether it be a traditional frequentist drug development programme or one using a Bayesian paradigm, when fewer patients are needed to demonstrate a beneficial treatment effect, a sound alternative is to collect additional safety data outside the context of such complex and expensive efficacy trials. For example, simpler trials could be designed with fewer visits, fewer efficacy and quality of life procedures, less restrictive I/E criteria and so on to build the safety database. These simpler trials might also be more reflective of clinical practice and provide better insight into the safety issues with a new treatment under more normal conditions of use. Thus, a Bayesian approach could confirm the benefits of the new treatment in smaller — but more complex and expensive — trials while the entire clinical development programme can be used in more efficient ways to build better evidence regarding the safety of the new treatment.

Second, a Bayesian approach may also be helpful in synthesizing information across a drug development programme, which is often not powered to test statistical hypotheses about specific adverse events. Small numbers of unexpected adverse events are occasionally reported in a trial, and determination of whether such events are a true treatment effect or a spurious finding is difficult. Although evaluating unexpected adverse events is inevitably post hoc, most often sponsors and regulators make intuitive judgements regarding their prior belief of a causal link between the treatment and the unexpected safety finding. Although defining a prior in a post hoc way may seem contradictory, a Bayesian approach may help to formalize understanding of different perspectives and quantify the level of posterior belief for the treatment effect on such adverse events. Thus, a Bayesian analysis could be a more informative way of describing the potential risks of a new treatment based on the accumulation of safety data across a drug development programme. Such quantification is generally not suited for a frequentist null hypothesis significance testing approach, and p values are often not relevant in such situations.
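
As a hypothetical illustration of such quantification, the sketch below uses conjugate beta-binomial models with a mildly sceptical prior to express the posterior probability that an unexpected adverse event occurs more frequently on treatment than on control. All counts and prior parameters are invented for illustration; in practice the prior would be elicited from clinicians, sponsors and regulators.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2024)

# Hypothetical counts of an unexpected adverse event pooled across a programme.
events_trt, n_trt = 7, 1500
events_ctl, n_ctl = 2, 1500

# A mildly sceptical Beta prior on each arm's event rate; in practice the prior
# would reflect the sponsor's and regulator's prior belief about a causal link.
a0, b0 = 0.5, 10.0

# Conjugate Beta posteriors for the event rate in each arm.
post_trt = stats.beta(a0 + events_trt, b0 + n_trt - events_trt)
post_ctl = stats.beta(a0 + events_ctl, b0 + n_ctl - events_ctl)

# Posterior probability that the treatment increases the adverse event rate,
# estimated by Monte Carlo from the two independent posteriors.
draws_trt = post_trt.rvs(100_000, random_state=rng)
draws_ctl = post_ctl.rvs(100_000, random_state=rng)
print(f"Pr(rate on treatment > rate on control | data) = "
      f"{np.mean(draws_trt > draws_ctl):.3f}")
```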

Furthermore, the use of Bayesian methods has the potential to result in a more appropriate use of evidence generated in clinical trials. In particular, evidence from a trial for which a conventional frequentist hypothesis test fails to reach statistical significance still contributes towards a calculation of whether a treatment effect of a particular magnitude has (or has not) been established, rather than the trial simply being viewed as ‘failed’, as is often done in both a regulatory and an academic context. Certainly, some phase III trial and academic study ‘failures’ represent false negative findings, and Bayesian approaches can create a scientific basis to consider how evidentiary standards for ‘success’ are framed, giving an opportunity to tailor those requirements to each therapeutic setting. A recent re-analysis of a failed trial of therapeutic hypothermia for paediatric cardiac arrest (p = 0.14) used a Bayesian approach to calculate a posterior probability of therapeutic benefit of 94%14. The authors argue that the results presented this way are in stark contrast to the original study conclusion, which stated that therapeutic hypothermia did not confer a significant benefit15.
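
The following minimal sketch shows, under a vague prior and a normal approximation, how a two-sided p value of 0.14 in favour of treatment translates into a posterior probability of benefit of roughly 93%. It is not the model used in the cited re-analysis, which was based on the actual trial data and its own prior, but it illustrates why a 'failed' frequentist test can still correspond to substantial posterior support for benefit.

```python
from scipy import stats

# Two-sided p value reported for the original frequentist analysis.
p_two_sided = 0.14

# z statistic implied by the p value under a normal approximation, taking the
# observed effect to favour the treatment arm.
z = stats.norm.isf(p_two_sided / 2)  # approximately 1.48

# With a vague (flat) prior on the treatment effect, the posterior probability
# that the effect favours treatment is approximately Phi(z).
p_benefit = stats.norm.cdf(z)
print(f"Approximate posterior probability of benefit: {p_benefit:.2f}")  # ~0.93
```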

Lastly, for treatments given conditional or accelerated approval, subsequent phase III commitments for confirmatory trials could use the trial that is the basis for accelerated or conditional approval to form an appropriate prior for the confirmatory trial. Such post-approval commitments for additional trials tend to be more difficult to complete in the presence of the already marketed product, and a Bayesian approach could, under appropriate circumstances, be a low-risk regulatory approach to avoid large, expensive and potentially wasteful trials.

Supplementing data with an external control group

Bayesian augmented control designs allow researchers to reduce the number of participants required for a trial by incorporating, or borrowing, information on control groups from historical studies or, in rare diseases, well-designed natural history studies, without sacrificing power to detect an effect. The method used to borrow historical controls can vary across study types, and rigorous assessment of the external source is required to reduce bias16. For instance, bias can occur if the historical control sample is dissimilar to the current trial’s control arm or if the standard of care in medical practice has evolved over time. Thus, an important part of any study design is to be comfortable that the chosen design and the incorporation of historical data into the statistical analysis can result in reasonably unbiased estimates of treatment effect.

Bayesian augmented control designs have been employed effectively in early-stage oncology trials. In these studies, data on members of the control group are borrowed from other trials with similar demographics and disease characteristics. Ultimately, this method allowed a new trial to use 15–20% fewer participants than would be required for a standalone clinical trial with a full, concurrent control group17. This same approach could be used in phase III trials to create an even larger impact on the efficiency of clinical drug development18, including borrowing control data from studies in other therapeutic areas.
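
A minimal sketch of the borrowing mechanism for a binary endpoint is shown below, using a power-prior style discount of hypothetical historical control data to augment a smaller concurrent control arm. The counts and the borrowing weight are illustrative rather than taken from any specific trial.

```python
import numpy as np
from scipy import stats

# Hypothetical historical control data (responders out of patients) pooled from
# earlier trials with similar demographics and disease characteristics.
hist_resp, hist_n = 40, 200

# Borrowing weight: a value of 0.3 borrows the equivalent of 30% of the
# historical patients (power-prior style discounting of the external data).
w = 0.3
a_prior = 1 + w * hist_resp
b_prior = 1 + w * (hist_n - hist_resp)

# Smaller concurrent control arm and the treatment arm in the new trial.
ctl_resp, ctl_n = 12, 50
trt_resp, trt_n = 30, 75

post_ctl = stats.beta(a_prior + ctl_resp, b_prior + ctl_n - ctl_resp)
post_trt = stats.beta(1 + trt_resp, 1 + trt_n - trt_resp)  # vague prior for treatment

# Posterior probability that the treatment response rate exceeds the
# (augmented) control response rate, estimated by Monte Carlo.
rng = np.random.default_rng(7)
p_superior = np.mean(post_trt.rvs(100_000, random_state=rng)
                     > post_ctl.rvs(100_000, random_state=rng))
print(f"Pr(treatment response rate > control | data) = {p_superior:.3f}")
```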

Acceptance of this method has grown. The FDA has accepted trials using Bayesian augmented control designs into the Complex Innovative Design Program (see Related links). It would be beneficial to publish descriptions of such innovative trial designs and to share lessons on important points to consider before trial results become available. Publishing these studies would allow others to learn more about the implementation of innovative designs, expanding the field’s knowledge and experience. Additionally, it would help to develop best practices for investigations, to clarify assumptions related to the relevance of data from one source to another and to open discussion surrounding methods of adjustment to address deviations between data from the current trial and previously collected data.

Bayesian hierarchical models

Both Bayesian and frequentist hierarchical models are helpful because they allow us to assess different sources of variation in the data and account for variables at multiple levels of analysis19, 20 (Box 4). For instance, we can examine how a person’s symptoms change throughout a trial as well as differences that may occur at a group level. These methods also allow for borrowing of external data, under certain assumptions. This can be particularly helpful when investigating treatment effects across subgroups.

Using a Bayesian hierarchical modelling approach involves creating submodels that combine prior information and the available data to estimate the posterior distributions of the model parameters. The hierarchical model is created by combining these submodels, and the overall model accounts for uncertainty present at all levels. Further, in the process of creating a Bayesian hierarchical model, the researcher quantifies their assumptions and priors and makes them explicit in the model. This increases transparency compared with models focused on a single level of analysis, where such assumptions may be used implicitly to interpret statistical results. Bayesian hierarchical models have been used in a wide variety of drug development contexts, such as investigating subgroup findings and establishing drug safety.

Box 4 Bayesian hierarchical models

Bayesian hierarchical models allow us to examine sources of variation at various levels of analysis. At the top of the hierarchy is the overall treatment effect in the population of patients defined by the inclusion and exclusion criteria for a clinical trial. That overall treatment effect may be built upon subdivisions of the data that are nested in a way to make a hierarchical schema (see figure). The groupings at each level share some common attributes, and the relationship within and between groupings can be used to make more precise inference about a treatment effect that may differ between groups. In the schema shown in the figure, level 2 may include further subdivision of patients into refined subgroups.

Hierarchies can be quite general and represent many different scenarios of clinical interest. For example, level 1 in the schema may represent different subgroups of patients defined by phenotypic, genotypic or genomic factors. The hierarchical model allows for an overall treatment effect estimate but also a distinct treatment effect estimate in each subgroup. In practice, the subgroup treatment effect estimates will differ from the overall treatment effect estimate, and the fundamental question is whether such differences represent true heterogeneity of the treatment effect or merely random fluctuations due to sampling variability and the variability of the clinical outcome of interest. As described in the therapeutic hypothermia for hypoxic–ischaemic encephalopathy example in the main text, the Bayesian approach ‘shrinks’ the observed subgroup treatment effect estimates towards the overall treatment effect estimate, depending on the prior and how much weight is given to that prior. The FDA’s Impact Story on using innovative statistical approaches has some practical examples from actual clinical trials to describe this in more detail (see Related links).
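
A minimal sketch of this shrinkage is shown below, assuming a simple normal-normal model with a fixed between-subgroup standard deviation; in a full Bayesian hierarchical model this parameter would itself be given a prior and estimated. All estimates are hypothetical.

```python
import numpy as np

# Hypothetical observed treatment effect estimates and standard errors in four
# subgroups (for example, defined by a genotypic factor).
est = np.array([0.10, 0.45, 0.30, 0.65])
se = np.array([0.20, 0.15, 0.25, 0.30])

# Overall (precision-weighted) treatment effect estimate.
precision = 1 / se**2
overall = np.sum(precision * est) / np.sum(precision)

# Assumed between-subgroup standard deviation; a smaller value expresses a
# stronger prior belief that the subgroups behave alike.
tau = 0.15

# Normal-normal shrinkage: each subgroup estimate is pulled towards the overall
# estimate, more strongly when its own standard error is large relative to tau.
shrinkage = tau**2 / (tau**2 + se**2)
shrunk = overall + shrinkage * (est - overall)

for i, (raw, adj) in enumerate(zip(est, shrunk), start=1):
    print(f"Subgroup {i}: observed {raw:.2f} -> shrunken {adj:.2f}")
```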

Other hierarchical models may include different studies at level 1 with the same or different treatments at level 2. This approach was taken in the early 2000s for what is arguably the first FDA approval of a new treatment — a combination of pravastatin plus aspirin — using a Bayesian approach to estimate the treatment effect as the primary efficacy analysis61. Various models were examined to account for differences between studies, and prior distributions for all parameters in the model were defined explicitly. The result was that the pravastatin–aspirin combination was superior to placebo and, in fact, the effects were synergistic (the effect of the combination exceeded the sum of the effects of pravastatin and aspirin alone), with a posterior probability of 0.9999 for the synergistic effect.

As another illustration of a Bayesian hierarchical model, we may be interested in the effect of a treatment on a certain outcome for which we have a model that describes the probability of a patient having that outcome (overall treatment effect). But the effect of the treatment depends on a patient’s compliance with the treatment regimen, for which we may have a different model describing the probability or extent to which the patient adheres to the treatment regimen (level 1 of the hierarchy). Such a hierarchical model can be used to estimate the posterior distribution of each model parameter — for example, the probability of treatment adherence — and, subsequently, to make probability statements about the treatment effect.

Investigating subgroup findings

The safety or efficacy of a drug may differ for subgroups of participants. This is a vexing problem in clinical development as the analysis of multiple subgroups can lead to spurious or false positive findings21, which are sometimes referred to as ‘random highs’ or ‘random lows’ in response (see the FDA’s Impact Story on using innovative statistical approaches in Related links). That is, when clinical trial data are partitioned in many ways to create many subgroups, the observed treatment effects within individual subgroups are more likely to be larger or smaller than the true effect expected in such subgroups. Bayesian hierarchical models offer one approach to examining findings in a subgroup of people with similar demographic or clinical traits by using prior information or biological mechanisms to produce more reliable conclusions.

These subgroup investigations can take two forms: purely descriptive (for example, age, gender, ethnicity) where there is a basis to postulate that these do not modify effects; or investigations of whether drug effects are truly heterogeneous across subgroups as a step towards personalized medicine. Bayesian hierarchical models account for individual differences in the subgroup of interest at one level and borrow strength from the full model, which can decrease spurious findings and lead to more accurate treatment effect estimates19. However, for appropriate use, the assumptions must be plausible, and researchers must be careful in making assumptions about consistency across subgroups based on insufficient information.

Bayesian hierarchical models have been effectively used to investigate treatment effects in subgroups of patients with non-small-cell lung cancer (NSCLC). For instance, the Biomarker-integrated Approaches of Targeted Therapy for Lung Cancer Elimination (BATTLE) project — which was “the first completed prospective, biopsy-mandated, biomarker-based, adaptively randomized study in pretreated lung cancer patients” — used a Bayesian hierarchical model to examine the effectiveness of several targeted therapies for patients with NSCLC according to their biomarker status22. Patients were initially randomized equally to four treatments. As clinical outcome data accumulated over the course of the trial, a Bayesian hierarchical model was used to assess subgroups of patients with specific biomarker signatures to identify the treatment that was most likely to be beneficial for biomarker-specific patients based on a Bayesian posterior probability of the treatment effect. Randomization probabilities were adapted accordingly so that subsequent patients were more likely to get the most effective treatment according to their biomarker signature.

The Bayesian hierarchical model approach was better at identifying subgroups in which the treatments would be effective than independent analyses conducted in each subgroup23. Further, in combination with other approaches, such as adaptive design, Bayesian hierarchical models can reduce sample size and allow faster completion of clinical trials24. The success criterion for the trial was prespecified as a Bayesian posterior probability of >80% that a study treatment achieved a 30% disease control rate (DCR) at 8 weeks after randomization, and the overall DCR at this point was 48.6%. The study was considered a success in “establishing a new paradigm for personalizing therapy for patients with NSCLC.”22
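
As a hypothetical illustration of this type of success criterion, the sketch below computes a beta-binomial posterior probability that the disease control rate exceeds 30% and compares it with the 80% threshold. The counts are invented and do not correspond to any BATTLE treatment group, and the trial itself used a richer hierarchical model across biomarker groups.

```python
from scipy import stats

# Hypothetical 8-week disease control results for one treatment-biomarker group.
responders, n = 18, 35

# Beta(1, 1) prior on the disease control rate (DCR), updated with the data.
posterior = stats.beta(1 + responders, 1 + n - responders)

# Success criterion of the type used in BATTLE: the posterior probability that
# the DCR exceeds 30% must be greater than 0.80.
p_above_30 = posterior.sf(0.30)
verdict = "meets" if p_above_30 > 0.80 else "does not meet"
print(f"Pr(DCR > 0.30 | data) = {p_above_30:.3f} -> {verdict} the 80% threshold")
```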

Drug and vaccine safety

Bayesian hierarchical models have also been used to examine the safety of an experimental intervention. For instance, results from a measles–mumps–rubella–varicella (MMRV) vaccine trial were re-analysed using this approach25. This Bayesian hierarchical model accounted for adverse events at three separate levels (the type of adverse event, the body system affected and all of the body systems together), which allowed information to be borrowed across different subgroups, or body systems, to increase power. However, it also demonstrated that the assignment of adverse events to subgroups, in this case body systems, could alter the results, suggesting that subgroups should be identified on the basis of expert knowledge, not just by relying on statistical correlation. Furthermore, assessing the safety of a treatment can be a vexing multiple inference problem owing to the many types of adverse event that occur in clinical trials. Using hierarchical models to account for multiplicity issues related to drug safety assessments has also been proposed26.

Extrapolation

Extrapolation refers to an approach whereby information obtained from one or more subgroups of the patient population is applied to make inferences for another population or subgroup. This can reduce the number of patients in the latter group that need to be exposed to generate conclusions of the same scientific rigour. When data from prior trials exist and are determined to be relevant, Bayesian methods can be applied to incorporate that prior knowledge into future studies.

Extrapolation has been successfully achieved in various contexts, including extrapolation across species of infectious bacteria, across body systems and across age groups. Extrapolation can be relevant when one wants to apply information from a well-studied population or body site to one that is less studied. For example, data from studies with ambulatory boys with Duchenne muscular dystrophy could be extrapolated to inform design and analysis of studies in those who are non-ambulatory. Although extrapolation techniques exist in both Bayesian and frequentist frameworks, Bayesian methods can be used to extrapolate from a source population to a target population by directly using data from the source population to inform the prior distribution. Quantifying the extent to which treatment effects in the source population apply to a target population is complex. A Bayesian approach can address uncertainties related to the use of data from a source population by building an appropriate prior in which the treatment effect distribution reflects that uncertainty. Bayesian methods can explicitly quantify the uncertainty of extrapolation and also allow for source information to be down-weighted, thereby allowing the data from the target population to be weighted more heavily in the creation of the posterior distribution27.

Extrapolation from adult to paediatric populations is often of interest, and regulatory guidances exist28, 29, including an FDA guidance for medical devices that explicitly describes the use of Bayesian hierarchical models29. Although no such FDA guidance exists for therapeutic treatments, Bayesian methods have been used to successfully extrapolate from adult populations to paediatric populations. For example, Gamalo-Siebers et al.30 used several types of Bayesian model to extrapolate from information learned about the efficacy of the Crohn’s disease therapy rituximab in adults to provide insight into the efficacy of the drug in paediatric populations. They found that borrowing data from adults led to more precise drug efficacy estimates for children and advised that confidence in the Bayesian estimates of the treatment effect can be increased with proper planning: clearly stated assumptions, evaluation of model fit, justification of priors, assessment of the compatibility of the target (paediatric) and reference (adult) populations and more. The resulting reduction in sample size, which directly affects the cost and duration of a trial, can lead to greater efficiency in development and approval of medications for paediatric populations.
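
A minimal sketch of down-weighting adult information is shown below, assuming a two-component mixture prior (an informative component built from hypothetical adult data plus a vague component) evaluated on a grid. The weights and counts are illustrative only, and the models in the cited work are more elaborate.

```python
import numpy as np
from scipy import stats

# Grid over the paediatric response rate.
p = np.linspace(0.001, 0.999, 999)

# Informative prior component built from hypothetical adult data
# (120 responders out of 200 adults), plus a vague Beta(1, 1) component.
adult_resp, adult_n = 120, 200
informative = stats.beta.pdf(p, 1 + adult_resp, 1 + adult_n - adult_resp)
vague = stats.beta.pdf(p, 1, 1)

# Mixture weight: the prior belief that the adult data are relevant to children.
# Lower weights down-weight the adult information in favour of the vague component.
w = 0.6
prior = w * informative + (1 - w) * vague

# Hypothetical paediatric trial data and binomial likelihood.
ped_resp, ped_n = 14, 30
likelihood = stats.binom.pmf(ped_resp, ped_n, p)

# Posterior on the grid (normalized numerically).
posterior = prior * likelihood
posterior /= np.trapz(posterior, p)

post_mean = np.trapz(p * posterior, p)
print(f"Posterior mean paediatric response rate: {post_mean:.3f}")
```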

Decision-making for an ongoing clinical trial

Bayesian methods can be used in several ways to facilitate the workings of an ongoing clinical trial, including interim clinical trial monitoring and decision-making, utility analysis and sample size re-estimation. A Bayesian approach to monitoring trial progression can be helpful for assessing accumulating data and making modifications, such as stopping the trial for safety or efficacy reasons, altering sample sizes or altering randomization procedures to favour certain arms of the study31 (Box 3). An emerging and promising use of Bayesian methods for ongoing clinical trials is in the design and analysis of master protocols, which include basket, umbrella and platform trials32. Such protocols often involve adaptive features and Bayesian decision rules for futility or for advancing an experimental treatment to confirmatory clinical trials33. Because the Bayesian approach offers such flexibility, it is important to discuss these options with regulators and other involved parties to ensure satisfactory evidence is collected.

The Bayesian adaptive approach was used successfully to compare the efficacy and safety of dulaglutide and sitagliptin for treating type 2 diabetes mellitus34. In the first part of this study (phase II), the researchers aimed to determine whether dulaglutide was effective and, if so, to identify its optimal dose. In this trial, randomization probabilities were adapted on the basis of interim analyses every 2 weeks, using Bayesian decision rules regarding the probability that dulaglutide was superior to placebo and non-inferior to sitagliptin; at each interim analysis, the randomization probabilities across the seven dulaglutide dose levels studied were adjusted accordingly. Bayesian probabilities were also used to assess whether the phase II portion of the trial should be terminated for futility. The Bayesian interim analyses ultimately helped to select the optimal doses of dulaglutide to pursue in the second part of this study (phase III), which was highly successful. The Bayesian approach allowed for seamless integration of data across the phases of the study for making statistical inference about the treatment effect.
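
The sketch below illustrates, with invented interim summaries, thresholds and a hypothetical non-inferiority margin, how posterior probabilities of superiority to placebo and non-inferiority to a comparator can be turned into a futility screen and updated randomization weights. It is a schematic of the general approach rather than the actual decision rules used in the dulaglutide trial.

```python
import numpy as np

rng = np.random.default_rng(11)

# Hypothetical interim summaries: mean HbA1c reduction and standard error for
# placebo, an active comparator and three experimental dose arms (all invented).
arms = ["dose_low", "dose_mid", "dose_high"]
mean_pbo, se_pbo = 0.1, 0.15
mean_cmp, se_cmp = 0.7, 0.15
mean_dose = np.array([0.5, 0.8, 0.9])
se_dose = np.array([0.20, 0.20, 0.20])

# Approximate posterior draws for each arm's mean effect (vague priors and a
# normal approximation to the interim estimates).
n_draws = 100_000
d_pbo = rng.normal(mean_pbo, se_pbo, n_draws)
d_cmp = rng.normal(mean_cmp, se_cmp, n_draws)
d_dose = rng.normal(mean_dose, se_dose, size=(n_draws, 3))

# Posterior probabilities used in the decision rules: superiority to placebo and
# non-inferiority to the comparator (hypothetical margin of 0.25).
p_sup = (d_dose > d_pbo[:, None]).mean(axis=0)
p_noninf = (d_dose > (d_cmp - 0.25)[:, None]).mean(axis=0)

# Drop doses with a low probability of superiority (futility-style screen), then
# weight future randomization towards the doses most likely to be the best.
keep = p_sup > 0.30
p_best = np.bincount(d_dose.argmax(axis=1), minlength=3) / n_draws
weights = np.where(keep, p_best, 0.0)
weights = weights / weights.sum()

for name, ps, pn, w in zip(arms, p_sup, p_noninf, weights):
    print(f"{name}: Pr(sup. to placebo)={ps:.2f}, "
          f"Pr(non-inferior)={pn:.2f}, next randomization weight={w:.2f}")
```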

Another emerging trial design and analysis paradigm that may have great potential for rare diseases is the small sample, sequential, multiple assignment, randomized trial (snSMART). In such designs, patients who do not benefit sufficiently from their initial randomized study treatment are re-randomized (that is, crossed over) to other treatments in the study, which can be different treatments or different doses of the same treatment. Data from both randomization stages of the design are combined to estimate the treatment effect of all treatments involved. An example of such a design is a randomized multicentre study for isolated skin vasculitis (ARAMIS) comparing the efficacy of three drugs: azathioprine, colchicine and dapsone35. Newly developed methods using Bayesian joint stage models of such designs36, 37 have demonstrated the possibility of reducing sample sizes by 15–60% while maintaining the validity of the inference about a treatment effect.
