ATRAcTR (Authentic Transparent Relevant Accurate Track-Record): a screening tool to assess the potential for real-world data sources to support creation of credible real-world evidence for regulatory decision-making

The use of real-world data (RWD), and of real-world evidence (RWE) derived from RWD, has been widely adopted by pharmaceutical developers and a variety of decision makers, including doctors, payers, health technology assessment authorities, and regulatory agencies (Berger et al. 2015; Berger et al. 2017; Schneeweiss et al. 2016; Berger and Crown 2022; Daniel et al. 2018; Zou et al. 2021). Credible RWE can be created from good-quality RWD from routine practice when investigated within well-designed and well-executed research studies (Schneeweiss et al. 2016; Berger and Crown 2022). Adoption and use of RWD are complicated by concerns about whether particular sources of RWD are of “good quality” and “fit-for-purpose”. These concerns have become more urgent as regulatory agencies increasingly use RWD as external comparators for single-arm clinical trials and explore whether non-interventional RWD studies can provide substantial supplementary evidence of treatment effectiveness. While the recent emphasis on data quality (DQ) has focused on the use of RWD for assessing disease burden and treatment effectiveness, evaluation of DQ and fitness-for-purpose is also required for safety studies. However, expanding the use of RWE in safety evaluation will probably require data sources beyond administrative claims (Dal Pan 2022).

The US Food and Drug Administration’s (FDA) guidance, “Assessing Electronic Health Record and Medical Claims Data to Support Regulatory Decision Making,” states that for all study designs, it is important to ensure the reliability and relevance of data used to help support a regulatory decision (FDA 2021a). Reliability includes data accuracy, completeness, provenance, and traceability; relevance includes the availability of key data elements (exposures, outcomes, covariates) and a sufficient number of representative patients for the study. The FDA guidance “Considerations for the Use of Real-World Data and Real-World Evidence to Support Regulatory Decision-Making for Drug and Biological Products” (FDA 2023) emphasizes the need for early consultation with the FDA to ensure the acceptability of study design and analytic plans. With respect to data sources, it states that feasibility of data access is critical: “such evaluations of data sources or databases for feasibility purposes serve as a way for the sponsor and FDA to (1) assess if the data source or database is fit for use to address the research question being posed and (2) estimate the statistical precision of a potential study without evaluating outcomes for treatment arms.”

The European Medicines Agency (EMA) in the European Union (EU) has also issued a draft “Data Quality Framework for EU medicines regulation” (EMA 2022). It defines DQ as fitness for purpose: the degree to which data meet user needs in relation to health research, policy making, and regulation, and reflect the reality they aim to represent (TEHDS EU 2022). It divides the determinants of DQ into foundational, intrinsic, and question-specific categories. Foundational determinants pertain to the processes and systems through which data are generated, collected, and made available. Intrinsic determinants pertain to aspects that are inherent to a specific dataset. Question-specific determinants pertain to aspects of DQ that cannot be defined independently of a specific question. The framework also distinguishes three levels of granularity of DQ: value level, column level, and dataset level. The dimensions (including subdimensions) and metrics of DQ are divided into the following categories: reliability, extensiveness, coherence, timeliness, and relevance.

Reliability (precision, accuracy, and plausibility) evaluates the degree to which the data correspond to reality.

Extensiveness (completeness and coverage) evaluates whether the data are sufficient for a particular study.

Coherence examines the extent to which different parts of a dataset are consistent in representation and meaning. This dimension is subdivided into format coherence, structural coherence, semantic coherence, uniqueness, conformance, and validity.

Timeliness is defined as the availability of data at the right time for regulatory decision making.

Relevance is defined as the extent to which a dataset presents the elements required to answer a research question.
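To make these dimensions concrete, the sketch below computes a few illustrative metrics over a toy dataset: completeness (extensiveness), range plausibility (reliability), and key uniqueness (coherence). The records, field names, and plausible ranges are invented for illustration and are not part of the EMA framework.

```python
from datetime import date

# Hypothetical patient records; fields and values are illustrative only.
records = [
    {"patient_id": "P1", "age": 54,   "sbp": 128, "visit_date": date(2023, 3, 1)},
    {"patient_id": "P2", "age": None, "sbp": 305, "visit_date": date(2023, 3, 2)},
    {"patient_id": "P2", "age": 61,   "sbp": 142, "visit_date": None},
]

def completeness(rows, field):
    """Extensiveness: fraction of records with a non-missing value."""
    return sum(r[field] is not None for r in rows) / len(rows)

def plausibility(rows, field, lo, hi):
    """Reliability: fraction of non-missing values inside a plausible range."""
    vals = [r[field] for r in rows if r[field] is not None]
    return sum(lo <= v <= hi for v in vals) / len(vals) if vals else 0.0

def uniqueness(rows, field):
    """Coherence: fraction of values that are distinct (duplicate-key check)."""
    vals = [r[field] for r in rows]
    return len(set(vals)) / len(vals)

# Column-level DQ report for the toy dataset.
report = {
    "age_completeness": completeness(records, "age"),           # 2 of 3 present
    "sbp_plausibility": plausibility(records, "sbp", 50, 250),  # 305 is implausible
    "id_uniqueness": uniqueness(records, "patient_id"),         # P2 is duplicated
}
```

In practice such checks would be run at scale by tools like those discussed below, with thresholds set per study question rather than fixed in code.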

TransCelerate has issued a simpler framework entitled “Real-World Data Audit Considerations” that is divided into pillars of relevance, accrual, provenance, completeness, and accuracy (TransCelerate 2022). These frameworks are part of an ongoing dialogue among stakeholders from which international standards for RWD will eventually emerge.

There have been a number of efforts to assess the utility of real-world data sources for a variety of purposes and settings. For example, Observational Health Data Sciences and Informatics (OHDSI) provides open-source tools, such as ACHILLES and the Data Quality Dashboard, which are being leveraged in the development of the DARWIN (Data Analytics and Real-World Interrogation Network) database in the EU. Other efforts have focused on the quality of prospective registries, which raise different DQ issues than the re-use of existing data sources; these include the Registry Evaluation and Quality Standards Tool (REQueST) developed by EUnetHTA (EUnetHTA 2019) and “Registries for Evaluating Patient Outcomes: A User’s Guide, Fourth Edition,” developed by the U.S. Agency for Healthcare Research and Quality (Glicklich et al. 2020).

In a systematic assessment of DQ evaluation, Bian et al. (2020) identified a broad set of DQ dimensions (currency, correctness/accuracy, plausibility, completeness, concordance, comparability, conformance, flexibility, relevance, usability/ease-of-use, security, information loss and degradation, consistency, and understandability/interpretability). They concluded that definitions of DQ dimensions and methods were inconsistent in the literature and called for further work to generate understandable, executable, and reusable DQ measures. To that end, we have developed a user-friendly set of screening criteria to help researchers of varying experience assess whether existing reusable RWD sources may be fit-for-purpose when the objective is to answer questions from regulatory agencies or to support claims regarding the benefits and risks of therapies.

We took our cue on the definition of “fit-for-purpose” from the FDA draft guidance on selecting, developing, or modifying fit-for-purpose clinical outcome assessments (COAs) for patient-focused drug development, intended to help sponsors use high-quality measures of patients’ health in medical product development programs (FDA 2022). That guidance states that fit-for-purpose in the regulatory context means the same thing as valid within modern validity theory: validity is “the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests,” and a clinical outcome assessment is considered fit-for-purpose when “the level of validation associated with a medical product development tool is sufficient to support its context of use.” While in epidemiology the term validity comprises internal and external validity relating to study design and execution, we designed the RWD screening tool to focus on evaluation of the RWD itself within the larger framework of modern validity theory (Royal 2017).

After all, as Wilkinson notes, “good data management is not a goal in itself, but rather is the key conduit leading to knowledge discovery and innovation, and to subsequent data and knowledge integration and reuse by the community after the data publication process” (Wilkinson 2016). Wilkinson proposed the FAIR principles for the management of RWD generated by public funds (although they are also applicable to datasets created in the private sector) (Wilkinson 2016); data sources should be findable, accessible, interoperable, and reusable. These recommendations are complemented by the recommendations of the Duke-Margolis white paper “Determining Real-World Data’s Fitness for Use and the Role of Reliability” (Mahendraratnam et al. 2019) that explored whether RWD are fit-for-purpose by the application of rigorous verification checks of data integrity.

While experts in modern validity theory have not reached consensus on the attributes of validity, there are basic tenets that most such theorists have adopted (Royal 2017). Validity pertains to the inferences or interpretations made about a set of scores, measures, or, in this case, data sources, as opposed to their intrinsic properties. As applied to the evaluation of RWD sources, this means that they must be considered fit-for-purpose for generating credible RWE through well-designed and well-executed study protocols to inform decision making. Modern validity theory suggests that accumulated evidence should be used to determine whether this inference regarding RWD quality is adequately supported. Hence, the validity of a data source is a judgement on a continuum to which new evidence is added; it is assessed as part of a cumulative process because knowledge of multiple factors (e.g., new populations/samples of participants, differing contexts, new knowledge) is gained over time. This element of RWD source evaluation is not specifically recognized in the current recommendations of the FDA and the EMA.

As noted earlier, an obstacle to developing a consensus on the evaluation of DQ is that many terms have been used to describe its dimensions and elements, and the terminology has been used inconsistently despite efforts at harmonization (Kahn et al. 2016; Bian et al. 2020). Regardless, RWE derived from RWD that focuses on the natural history of disease and the adverse effects of treatment has long been considered “valid” by decision makers. Recently, data validity has become an urgent focus of regulatory initiatives as the use of RWE derived from RWD expands to inform decisions about treatment effectiveness and comparative effectiveness. Such decisions demand a greater level of certainty for decision makers to trust the study results.

A crucial dimension in assessing the validity of data is transparency, achieved through traceability and accessibility. The FDA reinforced this point in several recent guidance documents, as has the HMA-EMA (European Union’s Heads of Medicines Agencies-European Medicines Agency) Joint Big Data Taskforce Report (HMA-EMA 2019). The FDA guidance on “Considerations for the Use of Real-World Data and Real-World Evidence to Support Regulatory Decision-Making for Drug and Biological Products” states that “If certain RWD are owned and controlled by other entities, sponsors should have agreements in place with those entities to ensure that relevant patient-level data can be provided to FDA and that source data necessary to verify the RWD are made available for inspection as applicable.” (FDA 2023). The FDA noted in its “Data Standards for Drug and Biological Product Submissions Containing Real-World Data Guidance for Industry” that “during data curation and data transformation, adequate processes should be in place to increase confidence in the resultant data. Documentation of these processes may include but are not limited to electronic documentation (i.e., metadata-driven audit trails, quality control procedures, etc.) of data additions, deletions, or alterations from the source data system to the final study analytic data set(s)” (FDA 2021b).

Interest in the creation of “regulatory-grade” RWD and RWE in the US was spurred by the 21st Century Cures Act. Parallel efforts in the EU include the Innovative Medicines Initiative (IMI) Get Real and the HMA-EMA Big Data Joint Taskforce. As noted in the Framework for FDA’s Advancing Real-World Evidence Program, which offers early regulatory engagement to evaluate the potential use of RWE to support a new indication for an already approved drug or to help satisfy post-approval study requirements, the strength of RWE submitted in support of a regulatory decision will depend on its reliability, which encompasses not only transparency in data accrual and quality control but also clinical study methodology and the relevance of the underlying data (FDA 2018, 2023).

The National Institute for Health and Care Excellence (NICE) in the United Kingdom issued a real-world evidence framework in 2022 (NICE 2022). Its elements addressing data suitability include data provenance and governance; DQ, including completeness and accuracy; and data relevance (data content, differences in patients and care settings, sample size, and length of follow-up). NICE developed the DataSAT tool, which requests information on data sources, data linkages, the purpose of data collection, a description of the data collected, the time period of data collection, data curation, data specification (e.g., a data dictionary), and data management/quality assurance.
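As a rough illustration of the structured metadata the DataSAT items call for, the sketch below defines a simple record type and a check for unanswered items. The field names are our paraphrase of the items listed above, not the official DataSAT schema, and the example values are hypothetical.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional

@dataclass
class DataSourceDescription:
    """Illustrative record of data-suitability items of the kind DataSAT
    requests; field names paraphrase the published list, not the tool itself."""
    source_name: str
    data_linkages: List[str]
    purpose_of_collection: str
    data_collected: str
    collection_period: str
    curation_process: str
    data_dictionary_available: bool
    quality_assurance: Optional[str] = None  # unanswered until documented

desc = DataSourceDescription(
    source_name="Hypothetical EHR network",
    data_linkages=["national death registry"],
    purpose_of_collection="routine clinical care",
    data_collected="diagnoses, prescriptions, laboratory results",
    collection_period="2015-2023",
    curation_process="central ETL with metadata-driven audit trail",
    data_dictionary_available=True,
)

# Flag items left blank so reviewers can see where documentation is missing.
missing = [k for k, v in asdict(desc).items() if v in (None, "", [])]
```

Keeping such a record per data source makes gaps in provenance or quality-assurance documentation explicit before a feasibility discussion with a regulator.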

The European Medicines Regulatory Network strategy to 2025 includes the creation of DARWIN (Data Analytics and Real-World Interrogation Network) (Arlett 2020; Arlett et al. 2021). It builds on the HMA-EMA Big Data Joint Taskforce Report (HMA-EMA 2019), which found that RWD is challenged by a lack of standardization, sometimes limited precision and robustness of measurements, missing data, variability in content and measurement processes, unknown quality, and constantly changing datasets. Citing Pacurariu et al. (2018), the report described the number of EU databases that currently meet minimum regulatory requirements for content and are readily accessible as “disappointingly low”. The International Coalition of Medicines Regulatory Authorities (ICMRA) has called for global regulators to collaborate on standards for incorporating real-world evidence in decision making (ICMRA 2023).

In developing a user-friendly set of criteria, we attempted to strike the right balance between the granularity of the requested information and the response burden. We defined the dimensions of data suitability (e.g., sufficiency of quality and fitness-for-purpose) in plain English, consistent with the existing frameworks discussed above. Although we primarily focus on the US and EU, the tool may be relevant to other jurisdictions as well, subject to local data-privacy requirements for data access.
