The TRIM study combines pharmacoepidemiologic analyses of real-world data (RWD) with decision science approaches. In Aim 1, we develop the TRIM decision support tool. Aim 2 comprises RWD analyses to select candidate medications for TRIM scoring. Aim 3 obtains TRIM input parameters for 24 teratogenic medications.
2.2 Aim 1: TRIM Development
Teratogenic Risk Impact and Mitigation (TRIM) is envisioned to comprise explicit, quantifiable criteria, which are developed building on the FDA’s proposed framework for risk mitigation of teratogenic drugs [7], using input from a national panel of experts who are familiar with clinical aspects of drug use during pregnancy and the prevention of prenatal exposure or are familiar with benefit-risk decision making in clinical and regulatory practice. The TRIM criteria are established using modified Delphi methods and multi-criteria decision analysis (MCDA) [13].
2.2.1 Expert Panel Members
For the national expert panel, we will recruit a multidisciplinary group of at least 30 clinicians, researchers, and government and patient representatives. Our selection criteria will include familiarity with REMS, drug benefit-risk assessments and/or familiarity with clinical aspects of drug use during pregnancy and prevention of prenatal exposure. Hence, particularly relevant recruitment pools for experts, including patient representatives, are members of related FDA and Centers for Disease Control and Prevention (CDC) advisory committees. Input from FDA staff regarding selection criteria will be obtained during monthly meetings. Prior to the first expert panel meeting, we will meet with patient representatives individually to ensure sufficient time to introduce the project, provide concrete examples of the tasks at hand, and answer questions.
2.2.2 TRIM Criteria Development
To develop TRIM criteria, we will use a modified Delphi method, which supplements questionnaire-based individual ratings with virtual group meetings. The Delphi method is an iterative process that uses repeated rounds of voting to achieve expert consensus in scenarios where individual opinion is important [14]. Voting results as well as comments of Delphi members in support of their vote are shared in aggregate in between voting rounds to inform subsequent votes, while diminishing disproportionate influence of dominating group members. Delphi methods are an established approach in healthcare settings, e.g., in the development of guidance for clinical or regulatory decision making [15,16,17,18].
During the first virtual meeting, the expert panel will arrive at a comprehensive set of candidate criteria that are used to determine the need for a REMS. Valuable context is available from the FDA advisory committee meeting in 2012, at which a decision framework was introduced [7]. At this meeting, the FDA shared results from an internal review of a convenience sample of teratogenic drugs with a diverse range of risk mitigation approaches (Fig. 1). Of note, specific criteria shared with the expert panel will be limited to those included in the FDAAA to ensure extensive brainstorming and capture of a broad range of perspectives. The key question that will drive the brainstorming session will be: “What criteria should be considered when deciding whether a drug needs a REMS?” Each proposed criterion must be quantifiable, either based on epidemiologic data or via an assessment on a grading scale.
Fig. 1 Drugs included in FDA’s review of decision criteria about risk mitigation (presented at the Drug Safety and Risk Management Advisory Committee meeting held December 12, 2012). BW boxed warnings, CAT category, CP communication plan, ETASU elements to assure safe use, ETASU A prescriber training/certification, ETASU B dispenser certification, ETASU D documentation of safe-use conditions, ETASU E patients subject to monitoring, ETASU F patient registry, FDA US Food and Drug Administration, MG medication guide, preg pregnancy, REMS Risk Evaluation and Mitigation Strategy, WP warnings and precautions. *Had RiskMap prior to REMS. †REMS officially approved in 2012
Criteria proposed during the virtual brainstorming sessions will be summarized and presented to the panel for individual vote using an online survey, including evaluations of criteria completeness, uniqueness, and relevance.
Once the final set of criteria has been selected, the panel will decide on the metrics that will be used to score each criterion. Anticipated metrics derived from epidemiologic studies are measures of absolute and/or relative risk of adverse outcomes, the risk of exposure during pregnancy, or measures of background utilization (Table 1). Grading scale-based metrics might utilize standard tools such as the GRADE practice recommendations to categorize quality of evidence [19]. Both the selection of final TRIM criteria and their corresponding metrics will occur during survey-based Delphi rounds, where the decisions and feedback from all members from the previous round are summarized and distributed to inform the next round, repeating the process until consensus has been reached. We define consensus as agreement by more than 75 % of panel members.
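The consensus rule for a single Delphi round can be illustrated with a minimal sketch. It assumes that each panel member casts one categorical vote per criterion and that "more than 75%" is read as a strict inequality; the function name and the vote labels are hypothetical and not part of the protocol.

```python
from collections import Counter

CONSENSUS_THRESHOLD = 0.75  # protocol: agreement by more than 75% of panel members


def consensus_reached(votes, threshold=CONSENSUS_THRESHOLD):
    """Return (reached, top_option, share) for one criterion in one Delphi round.

    `votes` holds one categorical vote per panel member
    (hypothetical labels such as "include" / "exclude").
    """
    if not votes:
        raise ValueError("at least one vote is required")
    top_option, top_count = Counter(votes).most_common(1)[0]
    share = top_count / len(votes)
    # Assumption: "more than 75%" means strictly greater than the threshold.
    return share > threshold, top_option, share


# Hypothetical round with 30 panel members.
round_votes = ["include"] * 23 + ["exclude"] * 7
reached, option, share = consensus_reached(round_votes)
print(reached, option, round(share, 3))
```

Under this reading, a criterion with exactly 75% agreement would go to another round rather than being accepted.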
Table 1 Examples of potential Teratogenic Risk Impact and Mitigation (TRIM) criteria and metrics
We anticipate that the virtual meeting will stimulate brainstorming to allow capture of the broadest set of criteria, while the written Delphi rounds will leverage maximum group input, avoiding dominance of certain group members.
2.2.3 TRIM Criteria Weighting
Once the final set of criteria and metrics has been developed, the expert panel will transition to determining criteria weights. Weight elicitation is a structured method of quantifying the relative importance of criteria to stakeholders [20]. The choice of a suitable weight elicitation method is based on several factors, including theoretical foundation, cognitive burden, feasibility, and robustness [21]. One of the most frequently used quantitative approaches to evaluating the benefit-risk profile of drugs is multi-criteria decision analysis (MCDA) [22]. The MCDA methods allow seamless and transparent integration of multiple criteria with their relative importance as perceived by decision makers [23]. A score can be calculated, representing the benefit-risk of the drug according to the specified criteria, which stakeholders can use to inform related decisions. The explicit consideration of criteria importance and the transparency brought about by this framework ensure that the underlying process of arriving at any decision is consistent and clear. Multi-criteria decision analysis has key recommended properties for decision making (i.e., capacity to capture multiple drug effects, frequency and desirability; integration of disparate information types; dealing with uncertainty; transparency; soundness) and is particularly recognized for its practicality [24]. Relevant to TRIM as a criterion-based framework (in contrast to decision trees), MCDA can also accommodate risk-benefit to multiple entities (e.g., individual risk-benefit and herd immunity for a vaccine or mother and fetal risk) [24]. Within MCDA techniques, discrete choice experiments (DCE) and swing weighting, among others, are some of the recommended and well-studied methods for eliciting preference weights and have been endorsed by the European Medicines Agency [25,26,27,28].
In DCE, individuals make trade-offs between scenarios (choice sets) that vary in several attributes [29].
2.2.3.1 Choice Set Design
In this DCE, choice sets will consist of hypothetical drug pairs where each drug varies regarding its composition of TRIM criteria levels. Choice set creation is done using the R statistical package idefix (v1.0.3) as it offers diagnostic tools to check the statistical properties of the design of the hypothetical choice sets [30]. To avoid cognitive fatigue when the same respondent is presented with too many choice sets, we will prioritize choice sets that force the respondent to make trade-offs to maximize the information gained in each set. Selection of choice sets relies on Bayesian approaches that introduce prior information on choice sets, which we glean from real-world examples (i.e., TRIM criterion level compositions of approved teratogenic medications). Reliance on real-world compositions of TRIM criterion levels retains the relevance of the DCE while minimizing expert panel burden.
2.2.3.2 Development and Administration of the DCE Survey Instrument
Examples of choice sets will be presented to the expert panel during a virtual group meeting before beginning the actual choice task within an online survey, with explanations about what a possible choice means (Fig. 2). Members will individually complete the survey instrument. To ensure that the survey instrument is user-friendly, we will conduct a pre-test among 10 participants selected from University of Florida (UF) faculty and graduate students who have some familiarity with REMS decision making.
Fig. 2 Example of choice sets reflecting potential Teratogenic Risk Impact and Mitigation (TRIM) criteria to prioritize drugs for risk mitigation
Calculation of a sufficient sample size for a DCE is a growing field. Guidance on sampling methods is based on empirical work and rules of thumb [31]. Lancsar and Louviere report that it is unusual to go beyond 20 observations per choice set to estimate a stable and reliable model [29]. In this study, 30 experts on the Delphi panel serve as respondents to the DCE, providing ample observations for analysis.
2.2.3.3 Data Analysis
The responses to the DCE will be analyzed using a mixed logit model [32], which allows for unobserved heterogeneity of preferences, and assigns random effects, simultaneously examining the effects of both the criteria and individual on choice [32]. The model estimates the value or utility each respondent attaches to the different levels of the criteria and how the criteria levels impact individuals’ choice. The deliverable of aim 1 is a fully developed decision tool, available as an Excel Add-in or internet-based scoring tool, which uses weights for explicit evidence-based criteria to generate an overall score indicative of the drug’s relative need for risk mitigation measures. The underlying model used is a weighted sum model [33].
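A weighted sum model of this kind can be sketched in a few lines. The sketch assumes that every criterion has already been mapped to a common 0 to 1 scale and that weights are rescaled to sum to 1; the criterion names, weights, and scores are hypothetical placeholders, since the actual TRIM criteria and weights are outputs of the expert panel.

```python
def trim_score(weights, scores):
    """Weighted sum of criterion scores.

    weights: dict mapping criterion -> non-negative weight
             (rescaled here so that the weights sum to 1)
    scores:  dict mapping criterion -> score on a common 0-1 scale (assumed)
    """
    if set(weights) != set(scores):
        raise ValueError("weights and scores must cover the same criteria")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[c] / total_weight * scores[c] for c in weights)


# Hypothetical criteria and values, for illustration only.
example_weights = {"risk_magnitude": 0.5, "severity": 0.3, "prenatal_exposure": 0.2}
example_scores = {"risk_magnitude": 0.8, "severity": 1.0, "prenatal_exposure": 0.25}
print(trim_score(example_weights, example_scores))
```

In practice the weights would be derived from the estimated DCE utilities rather than specified by hand as they are here.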
2.3 Aim 2: Medication Selection for TRIM Scoring
Aim 2 will develop population-based estimates of prenatal exposure to known teratogenic medications to inform selection of drugs that are subjected to TRIM scoring.
2.3.1 Data Sources
To enhance the generalizability of our results, we will conduct parallel analyses in Medicaid using Medicaid Extract Files (MAX) and their newer version, the T-MSIS Analytic Files (TAF), and in a privately insured population using MarketScan® Commercial Claims data. With some variation across states, Medicaid covers nearly half of live births in the USA [34]. Medicaid data files include diagnoses and procedures associated with inpatient and outpatient medical encounters, outpatient pharmacy records of dispensed prescriptions, and sociodemographic and enrollment information for all Medicaid beneficiaries. MarketScan includes a national sample of patients in employer-sponsored health insurance plans and provides the same detail on medical encounters and outpatient medication use as Medicaid.
2.3.2 Study Design and Participants
We will develop two retrospective cohorts defining two different source populations for estimation of prenatal exposure to teratogenic drugs, including a pregnancy cohort and a cohort of persons of child-bearing potential aged 12–55 years who have identified as female in the datasets, with no history of infertility identified from any medical encounter with diagnosis or procedure codes for bilateral oophorectomy, hysterectomy, total abdominal hysterectomy, sterilization, premature menopause, or natural menopause, and who are using a given teratogenic medication. The cohorts will be established using Medicaid data from 2010–2019 and MarketScan® data from 2010–2021.
2.3.3 Pregnancy Identification
We will employ our previously developed algorithm to identify pregnancy episodes in claims data, including both live birth pregnancies (full-term, pre-term, post-term) and non-live birth pregnancies (ectopic, spontaneous and induced abortions, and stillbirths) [35,36,37]. The algorithm uses validated measures for pregnancy endpoints and estimates gestational age to determine the date of the last menstrual period (LMP) [38,39,40,41,42].
2.3.4 Teratogenic Drugs Exposure
As part of previous work [2], we have established a list of teratogenic medications based on relevant monographs such as the Teratogen Information System (TERIS) [43] and Clinical Pharmacology®. In brief, two pharmacists independently selected all medications available in the USA during the study period that had either a known or potential risk for teratogenicity based on evidence ratings. We excluded benzodiazepines, opioids, obesity drugs if the mechanism of harm was solely attributed to weight loss during pregnancy, statins, tetracyclines, sex hormones (estrogens, progesterone/progestins, testosterone), drugs with infertility treatment indication (ganirelix, lutropin alfa, menotropins, letrozole, etc.), gonadotropin-releasing hormone analogs, abortion treatments (short-term misoprostol and mifepristone), and post-partum/abortion hemorrhage treatments (ergonovine and methylergonovine). These medications are either frequently used during pregnancy for medical purposes or are believed to have minimal risk [1, 44].
Criteria for medications were further expanded to consider dose and timing during gestation as relevant for teratogenic risk. For example, fluconazole has shown significant teratogenic effects at higher doses (≥ 450 mg) and is classified as a known teratogen at those doses, whereas the impact of the more prevalent lower single-dose administration is not entirely clear and was accordingly classified as potentially teratogenic.
We use National Drug Codes on pharmacy records and Healthcare Common Procedure Coding System (HCPCS) codes on medical encounters to identify teratogenic drug exposure.
2.3.5 Estimates of Prenatal Exposure to Known Teratogens
We generate two estimates of prenatal exposure, considering pregnancy episodes and episodes of medication use. First, for each drug with known teratogenic risk, we estimate the proportion of pregnancies with prenatal exposure. We require that persons have continuous health plan enrollment from three months before conception until 30 days after the pregnancy end date. The post-pregnancy enrollment requirement allows capture of encounter information that aids in the estimation of gestational age. Prenatal exposure is assessed during the gestational window during which the drug is expected to exert teratogenic effects. For comparison, we also report prevalence estimates for the three months preceding conception. To increase the probability that drugs were administered and not discontinued before the dispensed days’ supply was exhausted, we require a prescription fill or medical encounter indicating drug administration during the relevant assessment window. For example, drug supply from a prescription fill prior to conception that reaches into the first trimester is not considered to determine prenatal exposure. If a pregnancy is exposed to more than one teratogenic drug on our assessment list, each drug is considered individually.
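The pregnancy-based estimate can be illustrated with a minimal sketch for a single drug. It makes several assumptions that are not specified in the protocol: three months is approximated as 90 days, the risk window is expressed in days after conception, and all field and function names are hypothetical. Only a fill or administration date inside the window counts as exposure, so supply carried over from a fill before conception is ignored.

```python
from datetime import date, timedelta


def is_eligible(preg, lookback_days=90, followup_days=30):
    """Continuous enrollment from ~3 months before conception (assumed 90 days)
    until 30 days after the pregnancy end date."""
    return (preg["enroll_start"] <= preg["conception"] - timedelta(days=lookback_days)
            and preg["enroll_end"] >= preg["end"] + timedelta(days=followup_days))


def is_exposed(preg, fill_dates, window_start_day, window_end_day):
    """True if any fill/administration date falls inside the risk window.

    The window is given in days after conception (an assumption) and is
    clipped at the pregnancy end date.
    """
    start = preg["conception"] + timedelta(days=window_start_day)
    end = min(preg["conception"] + timedelta(days=window_end_day), preg["end"])
    return any(start <= d <= end for d in fill_dates)


def exposure_proportion(pregnancies, fills_by_person, window=(0, 90)):
    """Proportion of eligible pregnancies with a fill in the risk window."""
    eligible = [p for p in pregnancies if is_eligible(p)]
    if not eligible:
        return None
    exposed = sum(
        is_exposed(p, fills_by_person.get(p["person_id"], []), *window)
        for p in eligible
    )
    return exposed / len(eligible)


# Hypothetical records: one exposed, one with a pre-conception fill only,
# and one excluded for insufficient enrollment before conception.
base = {"conception": date(2015, 6, 1), "end": date(2016, 3, 1),
        "enroll_start": date(2015, 1, 1), "enroll_end": date(2016, 12, 31)}
pregnancies = [
    {**base, "person_id": 1},
    {**base, "person_id": 2},
    {**base, "person_id": 3, "enroll_start": date(2015, 5, 1)},
]
fills = {1: [date(2015, 7, 1)], 2: [date(2015, 5, 15)], 3: [date(2015, 7, 1)]}
print(exposure_proportion(pregnancies, fills))
```

The same function could be run with a window covering the three months before conception to produce the comparison estimate.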
Second, to put prenatal exposure risk in the context of the background utilization of each drug, we estimate the incidence of pregnancy during episodes of medication use by persons of child-bearing potential. We construct drug use episodes based on prescription fills and medication administration codes on clinical encounters and overlay these episodes with pregnancy episodes to derive incidence estimates of prenatal exposure. Drug exposure begins at the date of the prescription fill or medical encounter indicating drug administration and ends at the end of the dispensed days’ supply (allowing for 7-day gaps between refills) or the duration of drug action for drugs administered in clinics. As with pregnancy-based estimates, the overlapping portion of the pregnancy must be consistent with drug risk timing, i.e., include the trimester during which the drug poses teratogenic risk. For drugs with low background utilization, we will collapse both data sources to facilitate stable estimates.
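The construction of drug use episodes from prescription fills can be sketched as follows. This is a simplified illustration under stated assumptions: a fill that occurs no more than 7 days after the previous supply runs out extends the current episode, overlapping supply from early refills is not carried forward, and drugs administered in clinics (which use a duration of drug action instead of days' supply) are not handled. Function names are hypothetical.

```python
from datetime import date, timedelta


def build_episodes(fills, max_gap_days=7):
    """Merge prescription fills into continuous drug use episodes.

    fills: list of (fill_date, days_supply) tuples for one person and one drug.
    Returns a list of (episode_start, episode_end) date tuples.
    """
    episodes = []
    for fill_date, days_supply in sorted(fills):
        supply_end = fill_date + timedelta(days=days_supply - 1)
        if episodes and fill_date <= episodes[-1][1] + timedelta(days=max_gap_days + 1):
            # The gap since the previous supply ended is within the allowance:
            # extend the current episode.
            start, end = episodes[-1]
            episodes[-1] = (start, max(end, supply_end))
        else:
            episodes.append((fill_date, supply_end))
    return episodes


def overlaps(episode, window):
    """True if a drug use episode overlaps a (start, end) risk window."""
    return episode[0] <= window[1] and window[0] <= episode[1]


# Hypothetical fills: the second fill follows a 5-day gap (merged),
# the third follows a gap of several weeks (new episode).
example_fills = [(date(2020, 1, 1), 30), (date(2020, 2, 5), 30), (date(2020, 4, 1), 30)]
episodes = build_episodes(example_fills)
risk_window = (date(2020, 2, 20), date(2020, 5, 20))  # hypothetical risk window
print(episodes)
print([overlaps(e, risk_window) for e in episodes])
```

Overlaying each episode with the drug-specific risk window of a pregnancy episode identifies the events that enter the numerator of the incidence estimate.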
The 24 study drugs that are subjected to TRIM scoring in aim 3 will include all 12 drugs that have or had a REMS (Table 2) and the top 12 drugs originating from analyses of prenatal exposure risk. For the latter, we consider the following criteria in drug selection: first, drugs in the same class and with the same teratogenic mechanism will be considered together, e.g., ACE-inhibitors are considered as a class. Oncology drugs are grouped based on similar pharmacological properties. Second, frequency estimates of prenatal exposure across pregnancy episodes are prioritized over the medication use episodes as denominator. Third, if rankings differ between MarketScan and Medicaid populations, we will give priority to Medicaid results to focus on more vulnerable populations.
Table 2 Teratogenic drugs with current or previous Risk Evaluation and Mitigation Strategy (REMS) program
2.4 Aim 3: Scoring Medications Using TRIM
The newly developed TRIM tool is used in aim 3 to score 24 drugs. This will be done to (a) calibrate TRIM, i.e., using TRIM scores for a selection of drugs that can be used as benchmark for the assessment of other drugs; and (b) to provide comparative scores for these 24 drugs (12 without a current or former REMS) to inform further regulatory decisions on risk mitigation.
To obtain input parameters for TRIM metrics, we will conduct literature searches and generate empirical evidence. We anticipate that TRIM input parameters will include, at a minimum, measures of teratogenic risk (frequency and severity) and estimates of utilization by persons of child-bearing potential and prenatal exposure patterns.
2.4.1 Evidence Search
The evidence search focuses on retrieval of studies that have evaluated the risk and severity of teratogenic effects. For each study drug, we will review the information that is offered in TERIS, which provides rankings for the quality of evidence and the severity of risk based on the evidence available, considering reproducibility, consistency, and biological plausibility of available clinical, epidemiological, and experimental data [36]. We will supplement and update information in TERIS with PubMed searches, which consider outcomes involving teratogenicity, transplacental carcinogenesis, embryonic or fetal death, and other fetal and perinatal pharmacologic effects. In addition to searches of published literature, we will also retrieve relevant FDA documents including the application for drug approval and FDA review including risk mitigation decisions. This information will be particularly relevant for newer drugs with REMS where controlled human studies may not be available. All controlled studies retrieved from the TERIS bibliography and our literature searches are then assessed for quality using ROBINS-I, which is recommended by Cochrane Reviews for evaluating risk of bias for non-randomized studies [45].
If there are multiple studies with different outcomes, we will extract frequencies separately for each outcome, pending detail on the TRIM criteria. For example, in developing criteria for the TRIM tool, the expert panel might want to consider both the severity and the frequency of adverse effects, which could result in different combinations for different outcomes, e.g., minor malformation that is frequent versus major malformation that occurs infrequently.
2.4.2 Drug Utilization and Prenatal Exposure Data
Inputs for TRIM about background drug utilization or risk for prenatal exposure among medication users of child-bearing potential are obtained directly from aim 2. If TRIM includes other criteria that rely on real-world evidence and that are not considered here, we will amend the methods section of this protocol accordingly.
2.4.3 Grading of Evidence and Harm
We anticipate that some criteria that are selected by the expert panel will require a grading scale. Examples might include a metric for the quality of available evidence and the severity of the teratogenic effect. While the final decision about the type of utilized grading scale will rest with the expert panel, we plan to provide examples for review and discussion, such as the GRADE practice recommendations [19].
2.4.4 TRIM Output and Opportunities to Optimize Risk Mitigation Efforts
After all TRIM inputs are obtained, we will calculate the final TRIM score for the 24 study drugs. We envision a visual-analog scale to display TRIM scores, calibrated by our study drugs that can help determine the relative priority for risk mitigation for any other drug that might be considered for evaluation via TRIM (Fig. 3). We anticipate that the results may identify when a REMS should be considered in the overall management of teratogenic risk.
Fig. 3 Example of risk mitigation priority scale based on Teratogenic Risk Impact and Mitigation (TRIM)