Although traditional randomized trials remain the gold standard for assessing efficacy and safety of novel treatments, the slow pace and uncertain generalizability of traditional trials have prompted a growing interest in real-world evidence (RWE), including pragmatic or real-world clinical trials conducted in community settings.1-3 Recognizing the need for both more relevant evidence and a more efficient evidence-generating process, the National Academies of Sciences, Engineering, and Medicine Forum on Drug Discovery, Development, and Translation4 organized a series of workshops sponsored by the US Food and Drug Administration focused on Examining the Impact of Real-World Evidence on Medical Product Development.5 Those workshops considered specific dimensions in which RWE studies might differ from traditional clinical trials: use of real-world data, less standardized treatment delivered by community providers, and assignment of treatments by some mechanism other than individual randomization.
Expanding on these considerations, certain features distinguish traditional clinical trials from real-world or pragmatic clinical trials. In traditional trials, participant eligibility follows stringent criteria that restrict enrollment to patients likely to have the outcome of interest, likely to be responsive to the experimental intervention, and likely to adhere to treatment protocols. Conversely, pragmatic trials include all participants with the condition of interest regardless of predicted risk, estimated responsiveness, or likelihood of adherence. Traditional clinical trials apply meticulously specified experimental interventions delivered by experienced practitioners in settings selected for expertise with the patients enrolled in the study. Pragmatic trials include practitioners across the spectrum of usual care and in the full range of clinical settings. Whereas in traditional clinical trials the comparison intervention is narrowly defined and may include a placebo, pragmatic trials use a comparison that resembles usual care or the best alternative treatment approach.6
As participants in the aforementioned workshop series, we here discuss two questions prominent in those discussions: When is concealment or blinding of treatment assignment necessary? How strictly should treatment quality or intensity be standardized? Both questions involve “real-world” adaptations of traditional practice in clinical trials: open-label treatment and allowing natural variation in quality or intensity of treatment. For each of these questions, we identify specific issues that evidence generators should consider when designing real-world studies and that evidence consumers should consider when evaluating the validity and relevance of study results. These questions are most relevant to the design of pragmatic clinical trials, where investigators have some control over delivery of treatments and information available to study participants and personnel. However, some of these questions may be relevant to design or interpretation of observational research.
ADVANTAGES AND DISADVANTAGES OF CONCEALING OR BLINDING TREATMENT ALLOCATION
The original impetus for blinding patients, clinicians, and outcome assessors in clinical trials was reducing bias due to expectations or preferences.7 Those preferences could influence clinicians’ delivery of treatments, participants’ adherence to treatments or reporting of outcomes, and assessors’ evaluations of benefits or harms, all leading to systematic error. Potential biases introduced by unblinded or open-label treatment are most concerning when treatments are more complex or outcomes are more subjective. Although pragmatic trials often evaluate treatments in common use, sample sizes in pragmatic trials are typically much larger than those in traditional randomized clinical trials conducted prior to initial approval. Consequently, unbiased detection of less common adverse effects may be an important secondary aim of some pragmatic trials.
Blinding or concealment, however, may adversely affect both the efficiency of evidence generation and the generalizability of the resulting evidence. In addition to any direct operational costs of delivering blinded treatments, blinding may significantly reduce enrollment, leading to both added expense and delay. In one of the few rigorous evaluations of the effects of blinding, the Estonian Postmenopausal Hormone Therapy Trial8 included two parallel trial protocols, blinded and unblinded, and compared women’s willingness to enroll in each. Women were more willing to enroll in the nonblind subtrial (relative risk (RR) of willingness to enroll 1.17, nonblind vs. blind). Fewer exclusions in the nonblind arm resulted in higher overall eligibility (RR 1.10, nonblind vs. blind). Although similar numbers of women met the first stage of eligibility and were randomized to the nonblind and blind subtrials (2,087 and 2,084, respectively), a larger proportion consented and were recruited in the nonblind arm (48.0% vs. 37.4%).8 The requirement for blinding thus both reduced the likelihood that potentially eligible participants would enroll and yielded a study population less representative of all patients potentially eligible.
Procedures necessary for blinding or concealment may also influence decisions to enroll or distort the delivery of study treatments, reducing generalizability to real-world practice conditions. This is clearest when alternative treatments differ in their modes of delivery (oral vs. parenteral) or their requirements for clinical or laboratory monitoring. Requiring unnecessary procedures to maintain blinding may obscure rather than reveal true differences between treatments in acceptability, adherence, and real-world effectiveness.
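As a rough illustration of the size of the recruitment gap, the two recruited proportions quoted for the Estonian trial (48.0% nonblind, 37.4% blind) can be compared directly. The sketch below is only arithmetic on those two rounded percentages; the function and variable names are hypothetical, and the crude ratio it produces is not the same quantity as the RRs of 1.17 and 1.10 quoted in the text, which refer to willingness and eligibility.

```python
# Crude comparison of the recruited proportions quoted in the text
# (reference 8). Inputs are the rounded percentages as reported;
# names are illustrative, not taken from the trial report.

def compare_proportions(pct_a: float, pct_b: float) -> tuple[float, float]:
    """Return (ratio, absolute difference in percentage points)."""
    return pct_a / pct_b, pct_a - pct_b

recruited_nonblind_pct = 48.0  # nonblind subtrial
recruited_blind_pct = 37.4     # blind subtrial

recruitment_ratio, recruitment_diff = compare_proportions(
    recruited_nonblind_pct, recruited_blind_pct
)

print(f"ratio of recruited proportions: {recruitment_ratio:.2f}")
print(f"absolute difference: {recruitment_diff:.1f} percentage points")
```

Under these assumptions, recruitment in the nonblind subtrial was roughly 1.28 times that in the blind subtrial, an absolute difference of about 10.6 percentage points. No confidence interval is attempted here because the exact recruited counts are not given in the text.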
Balancing the advantages and disadvantages of blinding or concealment depends on the specific study question, the nature of treatments being compared, and characteristics of the study settings. We describe below a series of specific questions to inform or guide design decisions regarding concealment or blinding, illustrating with examples from recent real-world clinical trials.
QUESTIONS TO INFORM CHOICES REGARDING CONCEALMENT OR BLINDING OF TREATMENT ALLOCATION
Question 1: Will the providers, participants, and raters have expectations regarding likely benefits and adverse effects of study interventions?
Even when investigators perceive equipoise between alternative treatments, study participants or treating clinicians may have strong preferences or expectations regarding differences (Table 1). In comparisons of new products with treatments in common use, both patients and clinicians may anticipate that a new treatment will be superior. In comparisons of treatments in common use, expectations may be influenced by media reports or direct-to-consumer marketing. As seen during the coronavirus disease 2019 (COVID-19) pandemic, patient and clinician perceptions regarding highly publicized treatments may be strongly influenced by media reports and high-profile endorsements.9 A priori, investigators should consider to what extent evaluations and opinions expressed in popular media, whether or not they are backed by robust evidence, may influence the behaviors of study patients, clinicians, or raters.
Table 1. Considerations regarding blinding or allocation concealment

| Favors blinding or concealment | Favors open-label treatment |
| --- | --- |
| Participants, treating clinicians, and/or outcome raters expected to have strong preferences or expectations | Participants, clinicians, and raters not expected to have strong preferences or expectations |
| Treatment delivery, treatment adherence, or outcome assessment more likely to be affected by preferences or expectations | Treatment delivery, treatment adherence, or outcome assessment unlikely to be affected by preferences or expectations |
| Expectations or preferences in trial settings not expected to generalize to settings where trial results will be applied | Expectations or preferences in trial settings are similar to those where results will be applied |
| Concealing treatment assignment can reduce bias due to preferences or expectations | Blinding is not feasible or is unlikely to reduce potential bias |
| Blinding would not obscure meaningful differences between treatments in acceptability or adherence | Blinding would distort or obscure differences between treatments related to real-world effectiveness |
| Procedures necessary for blinding would not affect acceptability or risk of participation | Procedures necessary for blinding could reduce acceptability or increase risk of trial participation |

Example: The CURVES trial10 compared efficacy and evaluated dosing of atorvastatin, simvastatin, pravastatin, lovastatin, and fluvastatin. Based on an expectation that participants and clinicians would not have strong preferences or expectations regarding differences among these similar treatments, neither participants nor clinicians were blinded to treatment assignment.
Question 2: How might preferences or expectations influence intervention adherence, the fidelity or intensity of the treatment delivered, or the reporting of beneficial or adverse effects?
Participants’ or clinicians’ expectations, perceptions of treatments, or preferences could affect treatment delivery, adherence to study treatment, and the assessment of outcomes. The potential influence of preferences and expectations on treatment delivery or adherence would be expected to increase with treatment duration, treatment complexity, and the need for personalization or adjustment of treatment based on perceived beneficial or adverse effects. Preferences and expectations would be expected to have greater potential to bias assessment of outcomes requiring subjective assessments by participants or clinicians. Those biases could influence reporting or assessment of both benefits and adverse effects or potential harms.
As an example, in the CURVES trial10 comparing alternative statins using fixed dosing regimens, the efficacy end points were mean percent change in enzymatically measured plasma LDL cholesterol, total cholesterol, triglycerides, and high-density lipoprotein cholesterol concentrations over 8 weeks of treatment.11 Given the simple dosing profile for each of the study medications and the laboratory-based determination of study outcome, the opportunity for expectations to affect treatment delivery, patient medication adherence, or outcome assessment was limited. Consequently, neither participants nor clinicians were blinded to treatment assignment.
Question 3: How might those expectations or preferences differ in the settings where trial results will eventually be applied?
If the expectations or preferences of study participants and clinicians are similar to those in real-world settings where study treatments would eventually be delivered, then the effects of those preferences or expectations on treatment delivery or treatment adherence could be considered a valid signal rather than noise or bias. In that case, open-label treatment could reveal, rather than obscure, differences between treatments likely to occur in subsequent real-world clinical use.
As an example, the PRIDE trial11, 12 compared the long-acting, monthly dosed, injectable paliperidone palmitate to oral antipsychotic medication in participants with schizophrenia. Blinding would have obscured differences in participants’ and clinicians’ experiences of daily oral medication compared with monthly injections—inherent differences between treatments expected to occur in real-world practice. Consequently, the trial compared open-label treatment, not requiring either placebo pills for participants assigned to injectable medication or placebo injections for participants assigned to oral medication.
Question 4: How might concealing treatment allocation from participants and/or providers reduce biases due to preferences or expectations?
In scenarios where participant, provider, or rater preferences may influence treatment delivery or assessment of outcomes, blinding is preferable provided it does not introduce significant burdens or distortions.
As an example, the INVESTED trial13, 14 compared high-dose trivalent and standard-dose quadrivalent influenza vaccine for prevention of death or cardiopulmonary hospitalization in participants with recent myocardial infarction or heart failure. Given the clearly defined treatment protocol and outcomes, a belief in greater effectiveness of a high-dose vaccine might have little influence on delivery of treatment or ascertainment of outcomes. Nevertheless, blinding of alternative vaccine doses would eliminate any potential effect of expectations or preferences while introducing no burden or distortion. Consequently, both participants and clinicians were blinded to group assignment.
Question 5: How might concealing treatment allocation from participants or clinicians obscure meaningful differences between interventions?
When alternative treatments have inherent differences in mode or complexity of delivery (including required frequency of visits or laboratory monitoring), blinding could require significant distortion of one or both treatments. A comparison of those altered treatments could yield results not generalizable to real-world settings where treatments would eventually be delivered.
As an example, the InterSePT trial15 compared the risk of suicidal behavior in participants with schizophrenia or schizoaffective disorder treated with clozapine or olanzapine. Given the risk of agranulocytosis, treatment with clozapine required frequent laboratory monitoring. Requiring similar monitoring in both groups would have significantly distorted typical treatment with olanzapine. Consequently, neither participants nor clinicians were blinded, and no artificial conditions were imposed on the olanzapine group.
Question 6: How might procedures necessary to conceal treatment allocation from participants and/or providers impact the acceptability or risk of trial participation?
In addition to reducing generalizability, altering treatments to maintain blinding may reduce overall desirability of trial participation or introduce unnecessary risks.
As an example, in the PRIDE trial11, 12 comparing oral and long-acting injectable antipsychotic medication, blinding participants and clinicians would have required participants in both groups to both receive monthly injections and use daily oral medication. In such a scenario, the burden on trial participants would be greater than the burden of either treatment in real-world practice. In addition, requiring monthly placebo injections for those assigned to oral medication would create nontrivial risk. These considerations contributed to the choice of an open-label treatment protocol, with each treatment delivered as it would be in everyday practice.
ADVANTAGES AND DISADVANTAGES OF STANDARDIZING TREATMENT QUALITY OR INTENSITY
Traditional clinical trials typically compare highly standardized treatments delivered by expert providers in specialized treatment settings. This level of treatment standardization aims to reduce variation in treatment quality, maximizing precision to detect true differences between alternative treatments. In this paradigm, variation in quality or fidelity of treatment would be considered noise rather than signal.
Standardization of treatment, however, may sometimes obscure meaningful differences between treatments that would occur in the real-world settings where trial results would be applied. Naturally occurring variation in treatment might generate signal rather than noise when alternative treatments differ in the resources or expertise required for optimal delivery, the level of adherence necessary for clinical effectiveness, or the burden on participants in terms of administration and monitoring. The advantages of less “demanding” treatments may only emerge in the less standardized conditions of real-world practice.
Standardization of treatment or follow-up care, however, may sometimes be necessary to protect participant safety. Consequently, some artificial standardization may be necessary in pragmatic trials, even when that standardization might reduce generalizability of findings.
Whether strict standardization of treatment reveals or obscures true differences between treatments under study depends on the specific characteristics of the treatments, study participants, and study settings. We describe below a series of specific questions to inform or guide design decisions controlling or restricting treatment quality, illustrating with examples from recent real-world studies.
QUESTIONS TO INFORM CHOICES REGARDING STANDARDIZATION OF TREATMENT
Question 1: How much would the effectiveness or safety of the study treatment(s) vary among providers or care settings and how is this variability related to different levels of resources, experience, or expertise?
If a pragmatic trial aims to evaluate effectiveness and safety in real-world practice, then it is necessary to consider current variation in practice and to predict how a new treatment might actually be implemented (Table 2). In many cases, the resources and expertise available in trial settings exceed those available in community settings where trial results will be applied. The relative effectiveness or safety of alternative treatments may vary according to the expertise with which they are delivered.
Table 2. Considerations regarding standardization of treatment

| Favors more standardized treatment | Favors more naturalistic or variable treatment |
| --- | --- |
| Treatment effectiveness or safety is not expected to vary among clinicians or clinical settings | Treatment effectiveness or safety varies according to available clinical resources or expertise |
| Standardized study treatment more likely to match treatment in settings where results will be applied | Naturalistic or variable study treatment more likely to match treatment in settings where results will be applied |
| Standardization of treatment necessary for valid inference regarding safety or effectiveness | Standardization would obscure differences in safety or effectiveness likely to occur in subsequent real-world care |
| Standardization of treatment necessary to protect vulnerable participants or assure participant safety | Standardization of treatment not necessary to protect vulnerable participants or assure participant safety |

As an example, the ROCKET AF trial16 compared rivaroxaban and warfarin for prevention of stroke or thromboembolic event in participants with atrial fibrillation, enrolling participants at 1,178 clinical sites in 45 countries. Consistent with good clinical practice, the study protocol called for adjustment of warfarin dosing to maintain international normalized ratio (INR) values within an optimal range. Sites were expected to have a range of expertise in the management of warfarin treatment. As expected, sites varied considerably in the proportion of time that warfarin-treated participants had INR values in that optimal range. Despite this expected real-world variability, between-group differences in effectiveness were similar across sites with higher and lower rates of optimal treatment.
Question 2: What level(s) of resources/experience/expertise are now present in the care settings in which results of this trial will be applied?
When effectiveness or safety would be expected to vary according to the expertise or fidelity with which a treatment is delivered, pragmatic trial investigators should consider the expected practice patterns in settings where trial results would be applied. Matching study treatment with expectations regarding real-world implementation may involve selection of study sites, transparent reporting of variation across study sites, or both.
As an example, the RECOVERY platform trial17, 18 evaluating alternative treatments for COVID-19 aimed to rapidly inform treatment in hospital settings overwhelmed by the pandemic. Strict treatment protocols requiring on-site research staff and significant alterations in care processes were simply not feasible—either in the trial settings or in other similar settings where trial results would be applied. Consequently, study protocols were intentionally flexible, both to facilitate implementation during the trial and maximize generalizability of trial findings to real-world practice.
Question 3: What level(s) of resources/experience/expertise are now present in the care settings in which this trial could be conducted?
Ideally, clinical sites or settings for a pragmatic clinical trial should resemble those in which trial results would eventually be applied. When selecting study sites or evaluating generalizability of results, generators and consumers of RWE may use available data to assess relevant aspects of current practice.
As an example, the Salford Lung Study19 evaluated the efficacy and safety of an innovative combination inhaler (containing fluticasone furoate and vilanterol) relative to standard inhalers for preventing exacerbations of asthma and chronic obstructive pulmonary disease. Study treatments were delivered in 66 primary care clinics of Salford and South Manchester. Including all primary care clinics and pharmacies in a defined geographic area helped assure that study treatment would resemble that in practice settings where trial results would be applied.
Question 4: What special vulnerabilities or risks are anticipated in the study population?
Maximizing generalizability of study results to community practice usually argues for allowing natural variation in study treatments. In some cases, however, standardization of treatment is necessary to protect vulnerable populations or avoid specific risks.
As an example, the InterSePT trial15 comparing clozapine and olanzapine for the prevention of suicidal behavior enrolled participants with schizophrenia and either a history of suicide attempt or recent suicidal ideation. Consequently, development of protocols for both clozapine and olanzapine treatment considered appropriate monitoring for suicide risk or other clinical decompensation.
Question 5: Is there some minimal level of treatment standardization necessary for valid inference regarding the study question?
Although allowing more variability in quality or fidelity of treatment might improve generalizability to community practice, some minimal level of treatment quality or treatment intensity may be necessary for valid inference regarding the treatments under study. Consequently, treatment protocols in pragmatic trials may require some minimum qualifications for participating clinicians or other mechanisms for assuring a required level of treatment quality.
As an example, in the Salford Lung Study,19 all study treatments were provided by community pharmacies and managed by community primary care physicians. In order to assure adequate delivery of study treatment, all participating pharmacies and physicians received training in good clinical practice and training regarding the novel combination inhaler under study.
Question 6: Is there some minimal or base level of treatment quality necessary to assure participant safety?
Even when standardization of treatment is not necessary for valid inference, some level of standardization or quality control may be necessary to protect vulnerable trial participants. Monitoring and assuring participant safety may require a higher level of standardization than is typical in settings where trial results would be applied.
As an example, given that the InterSePT trial15 participants were at high risk for suicidal behavior, the study protocol called for more frequent visits than would be typical for patients treated with olanzapine. Although those more frequent visits could obscure differences between treatments in medication effects on suicidal behavior, more frequent monitoring was considered necessary to protect vulnerable study participants.
SUMMARY AND CONCLUSIONS
Whereas traditional randomized trials typically examine highly standardized treatments delivered under blinded conditions, both blinding and standardization of treatment may decrease efficiency of evidence generation and/or generalizability of evidence to real-world practice. Pragmatic or real-world clinical trials often involve open-label treatment and greater flexibility in treatment delivery. Because those departures from traditional clinical trial practice have the potential to undermine either trial validity or participant safety, designers of pragmatic trials should carefully assess when open-label treatment and/or more flexible treatment protocols are appropriate. We describe a series of specific questions to inform those decisions.
ACKNOWLEDGMENT
The authors would like to thank Dr. Carolyn Shore of the National Academies of Sciences, Engineering, and Medicine for her support of this project.
CONFLICT OF INTEREST
Dr. Simon is an employee of Kaiser Permanente Washington. Dr. Horberg is an employee of Kaiser Permanente Mid-Atlantic. Dr. Califf is an employee of Verily Life Sciences and Google Health and is a Board member for Cytokinetics. The content does not officially represent the views of their employers. All other authors declared no competing interests for this work.
1. Corrigan-Curay, J., Sacks, L. & Woodcock, J. Real-world evidence and real-world data for evaluating drug safety and effectiveness. JAMA 320, 867–868 (2018).
2. Eapen, Z.J., Lauer, M.S. & Temple, R.J. The imperative of overcoming barriers to the conduct of large, simple trials. JAMA 311, 1397–1398 (2014).
3. Sherman, R.E. et al. Real-world evidence - what is it and what can it tell us? N. Engl. J. Med. 375, 2293–2297 (2016).
4. Forum on Drug Discovery Development and Translation | National Academies <https://www.nationalacademies.org/our-work/forum-on-drug-discovery-development-and-translation>.
5. National Academies of Sciences, Engineering, and Medicine; Health and Medicine Division; Board on Health Sciences Policy; Forum on Drug Discovery, Development, and Translation; Forstag, E.H., Kahn, B., Wagner Gee, A. & Shore, C. (eds). Examining the Impact of Real-World Evidence on Medical Product Development: Proceedings of a Workshop Series (National Academies Press, Atlanta, GA, 2019).
6. Thorpe, K.E. et al. A pragmatic–explanatory continuum indicator summary (PRECIS): a tool to help trial designers. J. Clin. Epidemiol. 62, 464–475 (2009).
7. Day, S.J. & Altman, D.G. Blinding in clinical trials and other studies. BMJ 321, 504 (2000).
8. Veerus, P., Fischer, K., Hemminki, E., Hovi, S.-L. & Hakama, M. Effect of characteristics of women on attendance in blind and non-blind randomised trials: analysis of recruitment data from the EPHT Trial. BMJ Open 6, e011099 (2016).
9. Bull-Otterson, L. et al. Hydroxychloroquine and chloroquine prescribing patterns by provider specialty following initial reports of potential benefit for COVID-19 treatment — United States, January–June 2020. Morb. Mortal. Wkly. Rep. 69, 1210–1215 (2020).
10. Jones, P., Kafonek, S., Laurora, I. & Hunninghake, D. Comparative dose efficacy study of atorvastatin versus simvastatin, pravastatin, lovastatin, and fluvastatin in patients with hypercholesterolemia (the CURVES Study). Am. J. Cardiol. 81, 582–587 (1998).
11. Alphs, L., Mao, L., Rodriguez, S.C., Hulihan, J. & Starr, H.L. Design and rationale of the Paliperidone Palmitate Research in Demonstrating Effectiveness (PRIDE) study: a novel comparative trial of once-monthly paliperidone palmitate versus daily oral antipsychotic treatment for delaying time to treatment failure in persons with schizophrenia. J. Clin. Psychiatry 75, 1388–1393 (2014).
12. Alphs, L. et al. Real-world outcomes of paliperidone palmitate compared to daily oral antipsychotic therapy in schizophrenia: a randomized, open-label, review board-blinded 15-month study. J. Clin. Psychiatry 76, 554–561 (2015).
13. Vardeny, O. et al. High-dose influenza vaccine to reduce clinical outcomes in high-risk cardiovascular patients: Rationale and design of the INVESTED trial. Am. Heart J. 202, 97–103 (2018).
14. Vardeny, O. et al. Effect of high-dose trivalent vs standard-dose quadrivalent influenza vaccine on mortality or cardiopulmonary hospitalization in patients with high-risk cardiovascular disease: a randomized clinical trial. JAMA 325, 39–49 (2021).
15. Meltzer, H.Y. et al. Clozapine treatment for suicidality in schizophrenia: international suicide prevention trial (InterSePT). Arch. Gen. Psychiatry 60, 82–91 (2003).
16. Bansilal, S. et al. Efficacy and safety of rivaroxaban in patients with diabetes and nonvalvular atrial fibrillation: The Rivaroxaban Once-daily, Oral, Direct Factor Xa Inhibition Compared with Vitamin K Antagonism for Prevention of Stroke and Embolism Trial in Atrial Fibrillation (ROCKET AF Trial). Am. Heart J. 170, 675–682.e8 (2015).
17. RECOVERY Collaborative Group et al. Effect of hydroxychloroquine in hospitalized patients with Covid-19. N. Engl. J. Med. 383, 2030–2040 (2020).
18. RECOVERY Collaborative Group. Dexamethasone in Hospitalized Patients with Covid-19 — Preliminary Report. N. Engl. J. Med. 384, 693–704 (2021).
19. Bakerly, N.D. et al. The Salford Lung Study protocol: a pragmatic, randomised phase III real-world effectiveness trial in chronic obstructive pulmonary disease. Respir. Res. 16, 101 (2015).