The Use of Real-World Data for Estimating Relative Treatment Effects in NICE Health Technology Assessment Submissions: A Review

3.1 Included Submissions

Full details of the identification of relevant NICE submissions are presented in Fig. 1. In total, 569 HTA submissions were published between January 2016 and December 2023, including 549 single and 20 multiple technology appraisals across all disease areas. We excluded 99 terminated submissions and eight withdrawn submissions because they did not meet the NICE evidence requirements. We excluded a further eight submissions that we could not access, as they had been updated and replaced by a new submission or guidance.

Fig. 1 PRISMA diagram of the included UK NICE HTA submissions

Of the remaining 454 HTA submissions, 195 were excluded after initial screening because they did not include the relevant keywords. Of the 259 HTA submissions assessed for eligibility, 195 were excluded after full-text review. Among the excluded submissions, 70 (36%) used RWD for purposes other than informing treatment effects, for example, to inform input parameters in the decision-analytical model (such as costs, utilities, and transition probabilities) or the natural history of the disease; 59 (30%) were excluded because RWD were considered only by citing existing published studies or real-world clinical practice; and 58 (30%) included one of the keywords but did not actually consider any RWD (i.e. a single-arm trial was compared with another single-arm trial). The exclusion reasons are summarised in Table S1 in the Supplementary Material.

After a full-text review, we included 64 HTA submissions (58 single and six multiple technology appraisals) that used RWD to directly derive treatment effects.
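The selection flow described in this section can be checked with simple arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative and not part of the original review.

```python
# Counts taken from the text of Section 3.1; variable names are illustrative.
identified = 569          # submissions published January 2016 to December 2023
terminated = 99
withdrawn = 8
no_access = 8             # updated and replaced, not accessible

screened = identified - terminated - withdrawn - no_access
assessed = screened - 195   # excluded at keyword screening
included = assessed - 195   # excluded after full-text review

print(screened, assessed, included)  # 454 259 64
```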

3.2 Characteristics of the Included Submissions

Figure 2 shows the change in the use of RWD in NICE HTA submissions over the review period. Except for 2016, the percentage of appraisals in which RWD were used to inform treatment effects per total submissions varied between 10 and 27%. There was an upward trend in the use of RWD over time, except for 2020.

Fig. 2 RWD use for relative treatment effects estimation in UK NICE HTA submissions from 2016 to 2023

Sources of RWD included disease registries, EHRs, and medical chart reviews. Most studies were international and multicentre. The main countries collecting RWD through registries were the UK, USA, France, and Germany. US data included a wider range of databases at the regional and/or national level, whereas RWD in other countries tended to come from national registries (e.g. UK Systemic Anti-Cancer Therapy) or single-institution EHRs (e.g. UK Clinical Practice Research Datalink).

A total of 53 submissions received positive recommendations, although approximately 75% of these were made with conditions such as managed access agreements, commercial arrangements, or a 2-year stopping rule. Eleven appraisals were recommended for use in the UK Cancer Drugs Fund.

The most common diseases covered in the included submissions were lymphoma, non-small-cell lung cancer, carcinoma outside of the lung, and leukaemia (Fig. S1 in the Supplementary Material). Non-cancer areas included kidney disease, idiopathic pulmonary fibrosis, asthma, mastocytosis, lupus erythematosus, myelofibrosis, cystic fibrosis, hypertrophic cardiomyopathy, and COVID-19.

3.3 RWD Use and Sources

3.3.1 External Control Arm

All 64 submissions included in this review considered RWD to construct an external control arm (ECA), which was then compared with a treatment arm in a single-arm trial or RCT. In the latter case, an ECA was required when the control arm of an RCT was not considered adequate because (1) the RCT’s control arm was outside the NICE scope or not reflective of UK clinical practice, (2) the comparator was undefined or inappropriate for UK clinical practice (e.g., country-specific treatment regimens and different lines of therapies), or (3) the randomised sample was small, which resulted in low statistical power due to a lack of an established comparator (e.g. orphan disease with no approved treatments).

The identification of a relevant real-world ECA (e.g., current clinical management, the standard of care) was generally informed by a systematic literature review and clinical expert opinion. However, the suitability and quality of RWD sources were not formally assessed using assessment tools or reporting checklists such as the Data Suitability Assessment Tool [2].

Non-UK real-world ECAs were accepted in 36 submissions (56.3% of 64 submissions). The main justifications for using non-UK ECAs were small sample size and survival endpoint availability. Accepted non-UK real-world ECAs did not necessarily have better data quality than UK real-world ECAs; rather, they were the most complete datasets available at the time of analysis. Limitations such as the non-UK setting and assumptions about patient characteristics in the RWD were acknowledged; however, non-UK ECAs were not treated differently from UK ECAs.

Around two-thirds (44 submissions) of the total submissions considered RWD for ECA analysis only in the base-case analysis. In nine submissions, RWD were considered in both the base-case and sensitivity analysis, primarily to assess the robustness of the base-case assumptions, for example with respect to the statistical approach. Eleven submissions considered RWD only in sensitivity/scenario analysis. In some instances, RWD provided evidence to complement RCT evidence in respect of disease types (rare tumours) or subgroup effects (e.g. disease severity, tumour expression). In other cases, RWD were considered in sensitivity analysis to assess the generalisability of the findings, for example, by using a jurisdiction-specific disease registry to assess the generalisability of the RCT evidence from elsewhere.

Table 1 summarises the number of RWD sources used in ECA analysis according to their application in base-case versus sensitivity analysis. A total of 108 individual RWD sources, including disease registries, EHRs, and chart reviews, were considered in the 64 included submissions. Although EHRs and disease registries were considered equally often in the base-case analysis, disease registries were more likely than EHRs to be considered in the sensitivity analysis.

Table 1 The number of RWD sources used in external control arm analyses across base-case versus sensitivity/scenario analysis according to the type of RWD. Numbers in brackets are submissions

3.3.2 Extrapolation

In total, 12 submissions considered RWD (disease registries in all cases) to directly inform the long-term treatment effects.

In 10 submissions, RWD were used to calibrate the choice of the parametric curve. For example, the survival rates observed in disease registries could determine whether the parametric curves are underestimating or overestimating the survival of the control group. The other two submissions (TA396 and TA562) [13, 14] used RWD to adjust the long-term projections of the case-mixed model to inform treatment effects across different patient subgroups.
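The calibration idea in the preceding paragraph can be illustrated with a minimal sketch: compare the survival predicted by each candidate parametric curve at a landmark time with the survival observed in a registry. All numbers, parameter values, and names below are hypothetical and chosen only for illustration; they are not taken from any submission, and the reviewed submissions may have applied this check differently.

```python
import math

def exponential_survival(t, rate):
    """Survival probability at time t under an exponential model."""
    return math.exp(-rate * t)

def weibull_survival(t, shape, scale):
    """Survival probability at time t under a Weibull model."""
    return math.exp(-((t / scale) ** shape))

# Hypothetical registry observation: 20% of control patients alive at 5 years.
registry_5yr_survival = 0.20
landmark_years = 5.0

# Hypothetical fitted parameters for two candidate extrapolations.
candidates = {
    "exponential": exponential_survival(landmark_years, rate=0.5),
    "weibull": weibull_survival(landmark_years, shape=0.8, scale=3.0),
}

for name, predicted in candidates.items():
    gap = predicted - registry_5yr_survival
    direction = "overestimates" if gap > 0 else "underestimates"
    print(f"{name}: predicted {predicted:.3f}, {direction} registry by {abs(gap):.3f}")

# Pick the curve whose landmark prediction is closest to the registry value.
closest = min(candidates, key=lambda name: abs(candidates[name] - registry_5yr_survival))
print("closest to registry:", closest)
```

In practice a single landmark is a crude criterion; comparing several time points, alongside clinical opinion, would be a natural extension of this sketch.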

3.4 Analytical Strategies for Deriving Treatment Effects

3.4.1 External Control Arm Studies

Figure 3 describes the statistical methods used to adjust for differences between treatment and real-world ECAs, focusing on the primary base-case analysis. Nearly one-third (14 of 44) of submissions performed a naïve comparison (no confounding adjustments) between treatment and the real-world comparator. The proportion of naïve comparisons varied over time but appears to have decreased somewhat in recent years (Table S2 in the Supplementary Material). When confounding adjustments were performed (30 of 44), weighting was the preferred adjustment method (20 of 30), followed by matching, regression, and simulated treatment comparison (STC). Matching was particularly preferred over weighting or regression approaches when the RWD sample was large and/or model specification was judged more challenging. Potential confounders or effect modifiers were identified by systematic literature review and/or through discussions with clinical experts. Selected measured confounders were defined a priori, but additional confounding factors were sometimes considered in the sensitivity analysis, particularly when there was a lack of overlap in key prognostic factors between the single-arm trial and ECA groups.
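To make the contrast between a naïve comparison and a weighted comparison concrete, the following is a minimal sketch using toy data. It assumes a single binary confounder (poor prognosis) and weights ECA patients by the odds of trial membership, one common form of propensity-score weighting; the text does not specify which weighting scheme each submission used, and all data and names here are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (in_trial, poor_prognosis, responded)
patients = (
    [(1, 1, 1)] * 12 + [(1, 1, 0)] * 18    # trial, poor prognosis: 12 of 30 respond
    + [(1, 0, 1)] * 14 + [(1, 0, 0)] * 6   # trial, good prognosis: 14 of 20 respond
    + [(0, 1, 1)] * 4 + [(0, 1, 0)] * 16   # ECA, poor prognosis: 4 of 20 respond
    + [(0, 0, 1)] * 40 + [(0, 0, 0)] * 40  # ECA, good prognosis: 40 of 80 respond
)

def response_rate(rows, weights=None):
    """Weighted proportion of responders among the given rows."""
    if weights is None:
        weights = [1.0] * len(rows)
    return sum(w * r[2] for w, r in zip(weights, rows)) / sum(weights)

trial = [p for p in patients if p[0] == 1]
eca = [p for p in patients if p[0] == 0]

# Propensity of trial membership, estimated within each prognosis stratum.
counts = defaultdict(lambda: [0, 0])  # stratum -> [n_eca, n_trial]
for in_trial, stratum, _ in patients:
    counts[stratum][in_trial] += 1
propensity = {s: n[1] / (n[0] + n[1]) for s, n in counts.items()}

# Weight each ECA patient by the odds e / (1 - e), so the weighted ECA
# resembles the trial population on the measured confounder.
eca_weights = [propensity[s] / (1 - propensity[s]) for _, s, _ in eca]

naive = response_rate(trial) - response_rate(eca)
weighted = response_rate(trial) - response_rate(eca, eca_weights)
print(f"naive difference:    {naive:.2f}")     # 0.08
print(f"weighted difference: {weighted:.2f}")  # 0.20
```

In this toy example the ECA contains more good-prognosis patients than the trial, so the naïve comparison understates the treatment effect; the weighting only corrects for the measured confounder.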

Fig. 3 Statistical methods used to adjust for differences between treatment and external control arm in base-case analysis only (44 submissions)

The estimand of interest was reported in 52 submissions; 39 submissions reported an intention-to-treat analysis (average treatment effect). Appraisals often stated that this estimand was preferred by HTA decision makers because it tends to reflect the effect of the treatment policy in real-world practice (recognising that patients may discontinue or switch to alternative treatments). In nine submissions, the average treatment effect on the treated was adopted instead of the average treatment effect. For example, this was done when there was more than one comparator group and the baseline characteristics differed across the comparator groups. Per-protocol effects (treatment effect with strict protocol adherence) were reported in four submissions. In eight submissions, more than one estimand was reported.
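For reference, the two population-level estimands named above are commonly written in potential-outcomes notation as follows. The notation is an assumption introduced here for illustration and does not appear in the submissions: Y(1) and Y(0) denote a patient's outcome with and without the new technology, and A = 1 indicates patients who received it (here, the trial arm).

```latex
\begin{align}
\text{ATE} &= \mathbb{E}\left[\, Y(1) - Y(0) \,\right] \\
\text{ATT} &= \mathbb{E}\left[\, Y(1) - Y(0) \mid A = 1 \,\right]
\end{align}
```

The two coincide when treatment effects do not vary with the characteristics that distinguish treated patients from the wider population, and can differ otherwise.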

3.4.2 Extrapolation

Extrapolation of survival data was performed by fitting alternative parametric survival curves to each trial arm and the real-world ECA independently. The most popular parametric distributions were Weibull, exponential, Gompertz, log-logistic, log-normal, and generalised gamma distributions. The choice of parametric survival curve (and hence long-term survival projection) was informed by goodness-of-fit measures (Akaike information criterion/Bayesian information criterion) and clinical expertise. In 20 submissions, the real-world ECA was deemed inappropriate for extrapolation for various reasons, such as (1) short follow-up, (2) inclusion of treatments not observed in UK clinical practice, and (3) key endpoints (e.g. progression-free survival) not collected. In particular, the Systemic Anti-Cancer Therapy registry was deemed too immature to be extrapolated despite UK patient representativeness. In such cases, alternative RWD or trials were chosen to extrapolate survival in the control group.
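As an illustration of how goodness-of-fit measures can inform the choice of curve, the sketch below fits exponential and Weibull models to right-censored data by maximum likelihood and computes AIC = 2k - 2 log L and BIC = k log(n) - 2 log L. The data are hypothetical, the Weibull shape is found by a simple grid search rather than a proper optimiser, and n is taken as the number of patients (some analysts use the number of events instead); this is not the fitting procedure of any particular submission.

```python
import math

# Hypothetical follow-up times (years) and event indicators (1 = death, 0 = censored).
times = [0.5, 1.2, 1.9, 2.4, 3.1, 3.3, 4.0, 4.8, 5.5, 6.0]
events = [1, 1, 1, 0, 1, 1, 0, 1, 0, 0]

def exponential_fit(times, events):
    """Closed-form MLE for the exponential rate with right-censoring."""
    d = sum(events)
    rate = d / sum(times)
    loglik = d * math.log(rate) - rate * sum(times)
    return {"rate": rate}, loglik

def weibull_profile_loglik(shape, times, events):
    """Log-likelihood at a given shape, with scale set to its MLE for that shape."""
    d = sum(events)
    scale_pow = sum(t ** shape for t in times) / d  # MLE of scale**shape
    log_scale = math.log(scale_pow) / shape
    ll = sum(
        math.log(shape) - shape * log_scale + (shape - 1) * math.log(t)
        for t, e in zip(times, events) if e
    )
    ll -= sum(t ** shape for t in times) / scale_pow
    return ll, math.exp(log_scale)

def weibull_fit(times, events):
    """Grid search over the shape parameter (0.200 to 5.000 in steps of 0.001)."""
    best = None
    for i in range(200, 5001):
        shape = i / 1000
        ll, scale = weibull_profile_loglik(shape, times, events)
        if best is None or ll > best[1]:
            best = ({"shape": shape, "scale": scale}, ll)
    return best

def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik

def bic(loglik, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * loglik

exp_params, exp_ll = exponential_fit(times, events)
wb_params, wb_ll = weibull_fit(times, events)
n = len(times)

for name, ll, k in [("exponential", exp_ll, 1), ("weibull", wb_ll, 2)]:
    print(f"{name}: logL={ll:.3f}  AIC={aic(ll, k):.3f}  BIC={bic(ll, k, n):.3f}")
```

Lower AIC or BIC indicates a better trade-off between fit and complexity within the observed period only; as the text notes, it does not by itself validate the long-term projection, which is why clinical expertise is also used.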
