Data-Driven Identification of Factors That Influence the Quality of Adverse Event Reports: 15-Year Interpretable Machine Learning and Time-Series Analyses of VigiBase and QUEST


IntroductionBackground

Pharmacovigilance (PV) is the science and activities related to the detection, assessment, understanding, and prevention of adverse effects or any other possible drug-related problems []. Individual case safety reports (ICSRs) of suspected adverse drug reactions and adverse events following immunization (hereafter collectively referred to as adverse events [AEs]) collected in spontaneous reporting systems (SRSs) remain the cornerstone of postmarketing drug safety surveillance [,] (see for a list of definitions [-]).

Over 170 participating countries in the World Health Organization (WHO) Programme for International Drug Monitoring (PIDM) share reports of suspected AEs and collaborate worldwide in monitoring and identifying signals of AEs []. The WHO PIDM signal detection process is anchored on data recorded in the WHO global ICSR database, VigiBase, developed and maintained by the WHO Collaborating Centre for International Drug Monitoring, Uppsala Monitoring Centre (UMC), Sweden. Common technical specifications for report transmission and standard terminologies for drugs and reactions have evolved over the years to facilitate global information sharing and efficient analysis [,]. Currently, VigiBase accepts 3 standard formats: the original International Drug Information System (INTDIS) and 2 revisions of the International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (ICH) Guidelines for the Electronic Transmission of ICSRs, namely E2B(R2), and the latest E2B(R3), with all data being transformed to a format most closely resembling E2B(R2) in VigiBase [].

The participating members are characterized by diverse contexts—sociocultural, political, and clinical—that affect the measures in which reports are collected and processed, as well as their quality []. Ideally, a robust PV system should consider all data quality parameters, including accuracy, completeness, conformity, consistency, currency, duplication, integrity, precision, relevance, and understandability. Among all these parameters, problems associated with completeness (ie, missing data) have long been regarded as critical factors hampering the usefulness of existing reports [,]. Influxes of poorly documented reports could increase operational burdens, upsurge a system’s resources, and even mask or delay the detection of drug safety signals [].

While only 4 elements (identifiable patient, identifiable reporter, medicinal product, and AE) are required for a valid report, they are often insufficient for productive analyses of potential causal relationships between medicinal products and AEs []. In 2014, the UMC developed the vigiGrade completeness score, an automated multidimensional tool that measures the amount of clinically relevant information in reports essential for causality assessment, replacing the 4-grade WHO documentation grading scheme since the 1990s [,]. The vigiGrade score quantifies report completeness based on a selection of ICH-E2B fields: time to onset, indication, event outcome, patient age and sex, dose information, country of origin, reporter, type of report, and free-text fields. The vigiGrade score can be used to pinpoint trends in report quality over time and reflect systematic data quality issues in collections of reports from member countries. For instance, vigiGrade uncovered miscoded age units in US reports and missing AE outcomes in Italian reports []. The score may also guide reviewers in judging whether the information in a report suffices for a problem to be investigated []. Notably, vigiGrade has proven to be an indicator of a true signal and is part of the data-driven predictive model used by the UMC, vigiRank, for signal detection [].

The PV System in Malaysia

PV activities in Malaysia began in the 1980s with the establishment of the Malaysian Adverse Drug Reactions Advisory Committee (MADRAC) under the Drug Control Authority (DCA) []. The Malaysian national PV center is based within the National Pharmaceutical Regulatory Agency (NPRA) under the Pharmaceutical Services Programme of the Ministry of Health (MOH). Malaysia became a member of the WHO PIDM in 1990 and is regarded as an established PV center, receiving >30,000 reports annually, which is well above the WHO criteria of 200 reports per million inhabitants per year since 2009. Every AE report recorded in the national PV database (QUEST; see [] for a detailed description) is carefully processed and assessed by trained pharmacists at the national center and subsequently reviewed by the MADRAC before submission to the UMC for inclusion in VigiBase (Figure S1 in ).

Problem Statement and Research Benefits

Previous studies have not evaluated the quality of Malaysian AE reports, and little is known about the underlying factors affecting their vigiGrade completeness scores. However, identifying and validating factors associated with report quality was made difficult by the large number and variety of potentially correlated characteristics of a spontaneous report—at the reaction, drug, patient, reporter, sender, or regulator level (see the literature review on AE report quality in [,-]). As of 2019, Malaysian reports demonstrated a 5-year average completeness score of 0.79, surpassing the global average of approximately 0.44 in VigiBase and approaching the benchmark for well-documented reports (0.80). Therefore, this study primarily aimed to use a hypothesis-free, data-driven approach to explore the main drivers influencing the completeness of Malaysian reports in VigiBase over a 15-year period using vigiGrade. A secondary objective was to understand the strategic measures taken by the Malaysian authorities that preceded the relatively high completeness score across different time frames. A better understanding of the drivers of AE report completeness may be helpful for the NPRA and regulators worldwide.


MethodsData-Driven Framework for Identifying Factors Associated With AE Report QualityOverview

Our study used big data analysis approaches incorporating machine learning (ML) methods, which are becoming increasingly prevalent in clinical and epidemiological research [,]. These approaches aimed to overcome limitations inherent in traditional approaches in handling complex interactions among variables (eg, multicollinearity and nonlinearity). Importantly, ML methods focus on identifying patterns and associations within complex data rather than on establishing causal inference []. For clarity, we have outlined the similarities in concepts and nomenclatures between ML and traditional medical statistics in .

Hybrid Feature Selection

Our study leveraged ML methods for feature selection, mitigating human bias in analyzing extensive report characteristics, which might be overlooked by traditional hypothesis-driven approaches that are prone to high selection biases []. We combined statistical filtering and ML algorithms to preselect features, reducing overfitting risks (often arising from redundancy and multicollinearity) and computational costs [,]. Notably, Stevens et al [] used random forest (RF)–based feature selection to identify potential risk factors associated with cardiovascular diseases. In almost all domains, incorporating domain expertise remains vital for developing meaningful and effective models []. [,,-] provides detailed explanations.

Interpretable ML

Post hoc explanation methods such as Shapley additive explanations (SHAP) have been increasingly used to provide interpretability for complex black-box models such as RF []. Van den Bosch et al [] used regression coefficients and SHAP value analyses to identify risk factors associated with 30-day mortality among patients undergoing colorectal cancer surgery. Gong et al [] also developed an ML framework for acute kidney injury prediction and interpretation using SHAP values to assess feature contributions and identify specific patient impacts. [-] provides detailed explanations.

Study Design

This observational study used interpretable ML and descriptive time-series analyses of PV database data. Fundamentally, it was an exploratory data analysis assessing a large number of report characteristics without prespecified hypotheses [] aiming to identify factors influencing report quality. The main steps of our methodological workflow are illustrated in .

Figure 1. Overview of study workflow. Data Collection

AE report data were obtained from VigiBase in CSV format. Supplementary information, such as means of reporting, sender type, and sender region, was retrieved from the NPRA QUEST3+ (latest version) database in CSV format. For the secondary objective of understanding the key drivers of AE report completeness, strategies and interventions implemented in Malaysia were collected from the literature, official websites, and feedback from the NPRA. We included all reports recorded in VigiBase as of February 2021; reported in Malaysia; and received by the NPRA from January 1, 2005, to December 31, 2019. This 15-year range was chosen for its relevance to current PV needs and to enable the timely identification of necessary improvements. We excluded reports that (1) were suspected duplicates identified by the UMC’s vigiMatch [] (see for operational definitions), (2) were not sourced from Malaysia, (3) had null average completeness scores, and (4) lacked drugs marked as suspected or interacting.

Study VariablesDependent Variables or Outcomes

The vigiGrade completeness score (C; ranges from 0.07 to 1) was classified as well documented (C>0.8) or not well documented (C≤0.8; see for operational definitions).

Independent Variables or Explanatory Features

The variables related to administrative, sender, reporter, patient, drug, and reaction characteristics are presented in Table S1 in .

Data PreprocessingData Mapping

Supplementary data from QUEST3+ were mapped to the primary data set from VigiBase using the primary identifier. Given the distinct differences in reporting elements of INTDIS and E2B formats (input values vary in certain data fields), we divided the data set for separate analysis.

Data Cleaning and Feature Extraction

We cleaned and engineered the features from the available data based on the literature, domain knowledge, and previous experience. Information about the WHO Anatomical Therapeutic Chemical (ATC) classification system codes was provided by the UMC in the data set. If an active ingredient was linked to more than one code, an ATC level-2 code was manually assigned based on indication, route of administration, dosage, product information, and clinical narratives. Reporting qualifications in INTDIS format were harmonized with the E2B format with reference to supplementary data from the NPRA. We calculated the number of suspected or interacting drugs, concomitant drugs, and reactions for each report. We also included the annual staffing level of the national PV center in Malaysia and the means of reporting (based on report identifier).

Missing Values

Continuous variables consisting of null values were converted to categorical variables based on data distribution and domain knowledge. Missing values for categorical variables were grouped as a null category.

Data Splitting

To ensure a consistent distribution of target classes, we applied stratified random sampling. We allocated 90% of the data for training, which underwent 10-fold cross-validation, and reserved 10% for testing to gauge model performance on unseen samples. Our approach prioritized the extraction of insights from the current data set rather than overgeneralizations on future data.

Transformation and Encoding

To overcome data complexity and maximize interpretability, data at the drug event level were transformed to the case (report) level. Observations related to concomitant drugs were excluded as the vigiGrade scoring method is restricted to drugs listed as suspected or interacting []. We took the average value of a case for continuous variables whereby, for categorical variables, we examined the presence (or absence) of a particular drug- or event-related characteristic. Continuous variables were standardized. Binary categorical variables such as patient sex were integer encoded. One-hot encoding was performed on the remaining categorical variables, including ordinal variables and categories labelled as null or unknown. In the following sections, we distinguish variables from features, where the latter correspond to the processed variables in a binary fashion for the ML model input [,].

Multivariable ML AnalysisFeature Selection

We performed hybrid feature selection to eliminate redundant or less informative features before data mining using the ML algorithm. To avoid data leakage and the corresponding model overfitting, we conducted a 2-stage feature selection solely based on training data [,]. We first applied the univariable filter method to independently assess and preselect the features and subsequently selected the top-ranked features using RF-based recursive feature elimination coupled with multicollinearity assessment. The detailed processes are provided in .

Modeling and Validation

We applied a supervised ML method to identify key features relevant to the reports classified as well documented. Specifically, the RF classifier was selected for its robustness to nonparametric distributions, nonlinearity, and outliers [] and its out-of-the-box performance. Its built-in feature importance metrics allowed us to assess the relative attribution of a feature to the classification task. The more a feature is used to make key decisions with the forest of decision trees, the higher its relative importance. To mitigate class imbalance in the INTDIS data set, we used RandomUnderSampler with a 0.25 ratio that achieved optimal balanced performance of prediction and recall. We chose undersampling over synthetic sampling methods to preserve the real-world data characteristics. For the imbalanced INTDIS data set, we adjusted the class_weight parameter in the RF classifier to balanced. We evaluated the RF classification models using 10-fold cross-validation.

Performance Evaluations

Classification performance was measured using the area under the receiver operating characteristic curve, accuracy, recall (sensitivity), and precision (positive predictive values). For the imbalanced INTDIS data set, F1-scores (harmonic average of precision and recall) were reported.

Feature Explanations

To mitigate the issue of black-box predictions, we used TreeExplainer [] to generate SHAP summary plots that succinctly display the magnitude, prevalence, and direction of a feature’s effect by measuring each feature’s attributions to the classification. In SHAP, the feature effect is a measure of how much the value of a specific feature influences the prediction made by the model.

Software and Packages

All ML analyses were developed in Jupyter Notebook (Project Jupyter) using Python (version 3.7.9; Python Software Foundation). Statistical tests were performed using pandas (version 1.2.4) and statsmodels (version 0.12.2) []. ML analysis was completed using the scikit-learn package (version 0.24.2) []. SHAP values were calculated using TreeExplainer [].

Time-Series and Descriptive Statistical Analysis

We used time-series analysis and descriptive statistics to evaluate the trends in report quality and the characteristics associated with well-documented reports over different time frames. One-way ANOVA or 2-tailed Student t tests were conducted on continuous variables, whereas the chi-square or Fisher exact test was used to compare categorical variables, as appropriate. A P value of <.05 was considered statistically significant. All analyses were conducted using the SAS software (version 9.4; SAS Institute).

Ethical Considerations

This study was registered with the Malaysian National Medical Research Register (NMRR-20-983-53984 [Investigator Initiated Research]) and received ethics approval from the Medical Review and Ethics Committee, MOH, Malaysia (reference: KKM/NIHSEC/P20–1144(4)).


ResultsOverview

We analyzed the completeness of Malaysian AE reports in VigiBase received by the NPRA over 15 years. A total of 132,738 reports were included in the analysis following the predefined inclusion and exclusion criteria (Figure S2 in ). Table S1 in summarizes the characteristics of the INTDIS and E2B reports included in this study concerning administration, reporter, patient, drug, and reaction by status of being well documented. Among the included reports, 48.17% (63,943/132,738) were in the INTDIS format, and 51.83% (68,795/132,738) were in the E2B format. Over two-thirds (46,186/68,795, 67.14%) of E2B reports were well documented compared to 16.72% (10,691/63,943) of INTDIS reports.

Multivariable ML AnalysisSelected Features

For the INTDIS subsets, 90 features were preselected using univariate filter methods and further narrowed down to 33 features following RF-based recursive feature elimination ranking and multicollinearity assessment. For the E2B subsets, 90 features were preselected and subsequently reduced to 40.

Classification Performance

The performance of the RF models in classifying reports as well or not well documented is presented in .

Table 1. Classification performance of random forest model for the training (10-fold cross-validation) and test set.
Recall (%)Precision (%)Accuracy (%)AUROCa (%)F1-score (%)INTDISb
Training, mean (SD)99.6 (0.03)95.4 (0.13)99.0 (0.03)99.8 (0.01)97.4 (0.06)
Validation, mean (SD)73.7 (1.90)77.0 (1.02)90.3 (0.39)95.0 (0.33)75.3 (1.16)
Test74.974.391.595.174.6E2B
Training, mean (SD)99.7 (0.02)99.7 (0.02)99.6 (0.01)99.9 (0.001)99.7 (0.01)
Validation, mean (SD)96.9 (0.27)91.6 (0.34)92.0 (0.24)94.9 (0.29)94.2 (0.20)
Test96.990.991.495.193.8

aAUROC: area under the receiver operating characteristic curve.

bINTDIS: International Drug Information System.

Top Factors Predicting Status of Malaysian Reports Being Well DocumentedRF ML Model

Figure S3 in reveals the top-ranked features that contributed to the status of INTDIS and E2B reports being well documented derived from the RF ML model’s built-in feature importance metrics. However, the directions of their contribution were not known due to the black-box nature of the RF model. For INTDIS reports received between 2005 and 2016, PV center staffing was identified as the most important factor in predicting their status of being well documented. Other important factors included suspected drug withdrawal, the number of reactions reported, reaction abated upon drug dechallenge, and patient sex. Reaction abated upon drug dechallenge, on the other hand, appeared to be the most important factor predicting whether an E2B report received between 2015 and 2019 was well documented. Reports from other health care professionals (HCPs), reactions occurring <1 day, reports submitted by product registration holders (PRHs), and the number of concomitant drugs were also among the top 5 important factors.

SHAP Interpretation Method

The SHAP post hoc interpretations provided us with an understanding of the magnitude, prevalence, and direction of feature effects. depicts rich summaries of individual attributions for all features, allowing us to discover key factors that influence well-documented reports. Features with higher global importance have a greater influence on the model’s predictions. A feature with predominantly red dots to the right (eg, reaction abated upon drug dechallenge in B) implies a positive contribution to well-documented reports, whereas a negative contribution is indicated if the direction is to the left (eg, reports submitted by PRHs in B). Features with lower global importance but a long tail stretching in one direction indicate a rare but high-magnitude effect [,]. As mean SHAP values are calculated across all cases, a feature with a lower impact but higher prevalence may have a higher SHAP value []. For the least globally important features, we observed that their feature effects were not constant across cases, with blue and red dots dispersed in both directions. This variation may arise from interactions with other features that modulate their importance in different cases [].

Figure 2. (A) Top 33 features for the International Drug Information System (INTDIS) subset from 2005 to 2016; (B) top 40 features for the E2B subset during the years 2015 to 2019. The Shapley additive explanation (SHAP) bar plot illustrates global feature importances based on mean absolute SHAP values, highlighting the impact of each feature on the model’s predictions. Higher values represent greater influence. The waterfall plot indicates the cumulative contribution of features to the model. The SHAP summary plot of local explanations displays each observation as a dot, with its position on the x-axis (SHAP value) indicating the impact of a feature on the model’s classification for that observation. Continuous features (marked with an asterisk) range from low (blue) to high (red) values, whereas categorical features of a binary nature are either absent (blue) or present (red). The distribution of dots indicates the magnitude and prevalence of a feature effect. Features are ordered by global importance. ATC: Anatomical Therapeutic Chemical; HCP: health care professional; HLGT: Medical Dictionary for Regulatory Activities High-Level Group Term; NEC: not elsewhere classified; PhIS: pharmacy hospital information system; PRH: product registration holder; PT: Medical Dictionary for Regulatory Activities Preferred Term; PV: pharmacovigilance; UMC: Uppsala Monitoring Centre; UMC calculated seriousness: serious cases classified automatically by a UMC-developed algorithm.

Regarding INTDIS reports, in earlier years, PV center staffing was the primary factor driving the Malaysian rate of reports being well documented, with this factor alone accounting for >35% of the model’s explainability. The next most important factor favoring well-documented reports was drug withdrawal, followed by a duration of drug use of <1 week, reaction abated upon drug dechallenge, and reaction occurring <1 day. In contrast, an increased number of reported reactions and reports from pharmacists predicted not well-documented INTDIS reports.

In more recent years, the most important factor favoring well-documented E2B reports from Malaysia was reaction abated upon drug dechallenge, which alone was responsible for >25% of the model’s explainability. Among the top 25 features, which provided 90% of the model’s interpretation on classifying status of being well documented, 6 (24%) were found to be negatively associated with well-documented reports: reports submitted by PRHs; reports made by other HCPs; reactions under the Medical Dictionary for Regulatory Activities (MedDRA) High-Level Group Terms (HLGTs) product quality, supply, distribution, manufacturing, and quality system issues; reports received during the QUEST3+ pilot stage, reports received via web reporting, and adolescent patients (aged 12-17 years). E2B reports that involved reactions with a shorter time to onset and duration of drug use were more likely to be well documented. Other identified key drivers of Malaysian well-documented E2B reports were reports made by pharmacists, reports submitted from public specialist hospitals, pharmacy hospital information system (PhIS)–integrated reporting, and the involvement of systemic antimicrobials (ATC code J01).

Time-Series and Descriptive Statistical AnalysisTrend Analysis of Malaysian AE Report Quality (2005-2019)

depicts the time trends in AE reporting in Malaysia, illustrating how both the quantity and completeness scores of AE reports received by the NPRA grew over a 15-year period from 2005 to 2019. In and , we summarize the trends divided into 5-year subperiods, further stratified by sender type and reporter qualification. Of the total 132,738 reports received, 56,877 (42.84%) were well documented. Before 2014, the average completeness score consistently fell short of 0.5 but was slightly above the global average of 0.44. The volume of reports surged by 121% from 2013 to 2014, whereas the proportion of well-documented reports rose from practically 0% to 18.93% (2843/15,013). Since 2015, more than half (53,929/82,497, 65.37%) of the Malaysian reports were well documented, averaging 0.79 (SD 0.23) over the last 5 years, with a new high of 0.82 in 2019.

Figure 3. Distribution of average completeness scores and counts of Malaysian reports by status of being well documented in VigiBase over the study period. Table 2. The 5-year stratified summary statistics of overall Malaysian report quality.
Total2005-20092010-20142015-2019P valueaOverall reports, n (%)132,738 (100)11,458 (8.6)38,783 (29.2)82,497 (62.2)N/AbCompleteness, mean (SD)0.68 (0.25)0.47 (0.15)0.51 (0.18)0.79 (0.23)<.001Well-documented reports (Cc>0.8), n (%)56,877 (42.8)105 (0.9)2843 (7.3)53,929 (65.4)<.001

aP value based on ANOVA.

bN/A: not applicable.

cvigiGrade completeness score.

Table 3. The 5-year stratified summary statistics of well-documented Malaysian reports (N=56,877).
Total, n (%)2005-2009 (n=105), n (%)2010-2014 (n=2843), n (%)2015-2019 (n=53,929), n (%)P valueaSender type<.001
Public specialist hospital31,503 (55.4)72 (68.6)1542 (54.2)29,889 (55.4)

Public nonspecialist hospital4954 (8.7)3 (2.9)217 (7.6)4734 (8.8)

Public clinic15,609 (27.4)4 (3.8)866 (30.5)14,739 (27.3)

Other public services3 (0)0 (0)1 (0)2 (0)

University hospital496 (0.9)9 (8.6)25 (0.9)462 (0.9)

Private PRHb939 (1.7)11 (10.5)58 (2)870 (1.6)

Private hospital or clinic2114 (3.7)6 (5.7)106 (3.7)2002 (3.7)

Private community pharmacy45 (0.1)0 (0)0 (0)45 (0.1)

Consumer30 (0.1)0 (0)0 (0)30 (0.1)

Unknown1184 (2.1)0 (0)28 (1)1156 (2.1)
Reporter qualification<.001
Physician11,446 (20.1)53 (50.5)488 (17.2)10,905 (20.2)

Pharmacist40,295 (70.8)16 (15.2)1850 (65.1)38,429 (71.3)

Other HCPc4084 (7.2)36 (34.3)500 (17.6)3548 (6.6)

Consumer78 (0.1)0 (0)0 (0)78 (0.1)

Unknown974 (1.7)0 (0)5 (0.2)969 (1.8)

aP value based on the Fisher exact test.

bPRH: product registration holder.

cHCP: health care professional.

Over the 15 years, most well-documented reports in Malaysia came from public health facilities, with public specialist hospitals contributing more than half (31,503/56,877, 55.38%). Public clinics emerged as key contributors in later stages, with well-documented reports increasing considerably from 3.8% (4/105) in the period from 2005 to 2009 to 30.46% (866/2843) in the following 5 years. Compared to public services, the private sector consistently demonstrated a marginal contribution to quality AE reporting in Malaysia. In the earlier years, physicians contributed approximately half (53/105, 50.5%) of the well-documented reports. In the subsequent periods, reports from pharmacists showed a rise in quantity and average completeness, yielding the highest overall rate of being well documented (1850/2843, 65.07% to 38,429/53,929, 71.26%) among all reporter types from 2010 to 2019.

Key Strategies and Interventions Implemented in Malaysia (2005-2019)

In Malaysia, various strategies and interventions were implemented over the 15 years with the intent of improving AE reporting, as summarized by 5-year period in [,-]. While the impacts of most interventions are usually multifaceted at the national level and challenging to measure with limited quantitative information, there is a particular interest in understanding the influence of staffing levels at the PV center, the introduction of a new PV database, and enhancements to reporting tools on reporting quality at different time points.

Figure 4. Key strategies and interventions implemented to improve adverse event (AE) reporting in Malaysia between 2005 and 2019. CPD: continuing professional development; DIS: drug information service; HCP: health care professional; PhIS: pharmacy hospital information system; PRH: product registration holder; PRP: provisionally registered pharmacist; PV: pharmacovigilance; RiMUP: risalah maklumat ubat untuk pengguna.

shows the annual trends in PV center staffing in relation to rates of reports being well documented. depicts how the transition in report submission format (from INTDIS to E2B) and reporting means correlated with report quantity and average completeness. In , we focus on the rates of reports being well documented before and after the implementation of the new PV database (QUEST3+) and key enhancements to reporting tools since 2015. Information about reporting means was not available for INTDIS reports collected from the historical QUEST2 database. We further examined the influence and popularity of different reporting means among various reporters following the official launch of QUEST3+ and new web reporting tools in the first quarter of 2017 (Figure S4 in ).

Figure 5. Annual trends in pharmacovigilance (PV) center staffing levels and rate of reports being well documented (2005-2019). Figure 6. Mean completeness score and report count by submission format, pharmacovigilance database, and means of reporting yearly from 2013 to 2019. The size of the bubble corresponds to the report count. INTDIS: International Drug Information System; PhIS: pharmacy hospital information system. Figure 7. Rate of reports being well documented by submission format, pharmacovigilance database, and means of reporting quarterly from 2013 to 2019. 95% CI error bars (equivalent to 1.96×SE) were constructed. INTDIS: International Drug Information System; PhIS: pharmacy hospital information system; Q: quarter. Average Completeness of Individual Dimensions for Malaysian AE Reports

illustrates the trends in the average completeness of the individual dimensions for Malaysian reports in VigiBase. The dimensions for report type, primary source country, reporter qualification, and patient age and sex were consistently the most completed. Completeness of free text and drug indication improved from zero in the earlier period of 2005 to 2009 to >0.9 in recent years. An uptrend in improvements was also observed for reaction onset. Completeness for reaction outcome dipped in 2011 to 2012 but subsequently rebounded to >0.8. Unexpectedly, we observed a noteworthy drop in drug dosage completeness since 2010. The average completeness of each individual dimension for vigiGrade was evaluated for E2B subsets (Figure S5 in ).

Figure 8. Trends in the distribution of average completeness scores for Malaysian reports in VigiBase over the study period. The shaded area indicates overall completeness; the line represents the individual dimension.
DiscussionOverview of Principal Findings

In our study, we used a comprehensive approach to examine the factors influencing the quality of Malaysian AE reports and decipher the temporal trends and interventions that shaped the Malaysian PV landscape from 2005 to 2019. In the first part of our study, by harnessing a data-driven approach that encompassed ML-based feature selection and analysis, we identified the key features that predict the status of Malaysian reports in VigiBase being well documented. Our hybrid feature selection helped in mitigating the risks of overfitting and unstable interpretability that are commonly associated with high variance [,,] and resulted in a robust and valid RF model, as evidenced by the excellent classification performance (>90%) across all training, validation, and test sets for the E2B subset that reflects recent patterns of Malaysian reports (). While the model for the highly imbalanced INTDIS subset (containing only 10,691/63,943, 16.72% of well-documented reports) demonstrated satisfactory performance, with recall, precision, and F1-scores of >70%, the model faced overfitting. This issue, evident from the decreased validation and test performance, has been acknowledged as a limitation in our study. The black-box RF model was made interpretable using SHAP values that, in agreement with human intuition, did not contradict the findings from the time-series analyses. To the best of our knowledge, this is the first study to use state-of-the-art interpretable ML methods to obtain insights concerning the key factors contributing to AE report quality in the PV field. Supplemented with insights drawn from our time-series and descriptive analyses, we summarized the identified features associated with well-documented Malaysian reports under 3 main themes: administrative; sender and reporter; and reaction, drug, and patient.

In the second part of our study, our extensive time-series and descriptive analyses illuminated the notable progress Malaysia has made in both the quantity and quality of AE reports over the years. We delved further into the chronological trends and characteristics of Malaysian AE reports, outlined the key strategies and interventions implemented at 5-year intervals, and tracked the influence of interventions of interest. These findings offer valuable insights into the multifaceted strategies and interventions that have driven enhancements in Malaysian AE reporting quality. Finally, by focusing on individual dimensions for vigiGrade completeness scoring, we not only gained additional insights into the specific characteristics and aspects of report completeness within the Malaysian context but also pinpointed systematic data quality issues that warrant further attention and improvement work.

Factors and Characteristics Associated With Well-Documented Malaysian ReportsAdministrative

Our ML analysis distinguished PV center staffing level as the most important factor positively associated with well-documented INTDIS reports over 12 years between 2005 and 2016 (Figure S3 in and A). Previous studies [-] have also highlighted manpower at national PV centers as a barrier to functional and sustainable PV systems. During the initial stage (2005-2009), only 2 to 4 staff members handled all PV operations in Malaysia (). As the number of reports grew rapidly, the center struggled with a backlog of reports. The center swiftly grew from 8 to 21 staff members within 3 years (2012-2015) alongside the expansion of PV functions and task specializations, which coincided with a notable improvement in the completeness of reports received ( and ). Compared to the period from 2005 to 2009, more training workshops were delivered from 2010 to 2019 () to enhance staff and reporter competencies. As the staffing level and rate of reports being well documented became relatively stable afterward (), the influence of PV staffing levels appeared less distinctive for E2B reports (B). This could be due to some staff members focusing on other PV duties such as signal detection and assessment, risk management, and risk communication rather than AE processing. It is worth noting that the transition to E2B may have also obfuscated the influence of increased staffing on report completeness. While less distinctive in our model, we cannot exclude that other centers with E2B reports and low staffing may still face overall low report completeness.

The capabilities and scalability of PV databases are essential for effective data collection and management []. Integrating electronic reporting into hospital information systems has been reported to reduce duplicate work and effectively improve AE reporting []. In Malaysia, the new QUEST3+ database was later integrated with the AE reporting module of a centralized PhIS implemented at Malaysian public hospitals and clinics, enabling the automatic input of reporter and sender information and improved reporter accessibility to patient, drug use, and regulatory information (eg, product registration number and batch number). Our ML analysis of E2B subsets (B) revealed that PhIS-integrated reporting positively contributed to well-documented Malaysian reports, whereas web reporting had an unexpected negative association. The volume of PhIS reports surpassed other means in 2019 ( and Figure S4 in ) and maintained the highest reporting quality. Web reporting was initially introduced in 2000, but due to unstable systems and slowly developing IT infrastructures at some public health facilities in the last decade, most reports before 2015 were submitted manually via postage, fax, or email. After mandatory fields were added, web reporting recorded a rate of >70% of reports being well documented, but it later declined to 60%. A similar but larger declining trend was also observed with manual reporting (). These declines corresponded to the shift to PhIS reporting by public-sector pharmacists and were likely associated with other HCPs and consumers (Figure S4 in ). Our findings highlight that electronic reporting tools serve as ad hoc support for well-trained reporters but IT infrastructure maturity and widespread user acceptance are required for success. This is in line with a recent realist review [] asserting that technological interventions alone, without capacity building, have little or no impact on health data quality.

Sender and Reporter

Our study revealed that the greatest proportion of well-documented Malaysian reports, amounting to 91.54% (52,066/56,877), originated from public health facilities overseen by the MOH (). Among the well-documented reports, 98.15% (55,825/56,877) were submitted by HCPs. This finding aligns with those of previous studies [-] conducted across different regions of Malaysia, which consistently highlight that most HCPs recognize AE reporting as part of their professional obligations. In Malaysia, consumer-generated reports constituted only 0.14% (78/56,877) of the well-documented reports, which stands in contrast to countries such as Denmark and Norway where three-fifths of well-documented reports came from consumers or non-HCPs [].

Our observations revealed that Malaysian pharmacists generated 70.85% (40,295/56,877) of well-documented reports (), with most serving the public health sector []. Specifically, we identified public specialist hospitals and pharmacists as key features that positively contributed to the well-documented Malaysian E2B reports (B). It is noteworthy that, during the earlier stage (2005-2009), pharmacists’ reports exhibited the poorest quality and were recognized as a key factor predicting not well-documented INTDIS reports from 2005 to 2016 (A). Nonetheless, over the years, following increased recruitment in public health services [,] and multifaceted initiatives aimed at strengthening pharmacists’ roles and skills (), pharmacists have become an integral part of AE monitoring in Malaysia. Almost 9 in 10 public hospital pharmacists in Malaysia have reported at least one AE in the past [].

Malaysia presents a unique scenario in which pharmacists play a leading role in PV activities, distinguishing it from global trends, in which physicians contribute nearly two-thirds of well-documented reports []. The Malaysian context aligns with findings from a Spanish study in which pharmacists reported a great majority of the AEs due to the integration of PV into routine hospital pharmacy practices []. Similar observations were made in a pharmacist-led AE monitoring and management model in China, where pharmacists provided higher-quality reports among all HCPs []. Within Malaysian public hospitals, the pharmacist-led Drug Information Service (DIS) unit is responsible for facility-level PV activities, including responding to queries related to AEs; disseminating safety information; and compiling, verifying, and submitting AE reports to the NPRA [,]. In addition to direct detection and reporting by pharmacists, a collaborative mechanism exists within public health facilities where physicians and other HCPs are aware of the role of pharmacists in monitoring and reporting the AEs detected during clinical rounds or discussions. Moreover, we observed that over half (19,188/33,559, 57.18%) of the well-documented E2B reports made by pharmacists came from public specialist hospitals. These reports were believed to have benefited from the input of specialist physicians, suggesting a positive contribution of collaborative efforts among HCPs in enhancing the quality of AE reports.

In contrast, reports from PRHs and other HCPs (including regulatory affairs officers, clinical trial associates, nurses, and medical assistants) were flagged as the key features negatively associated with Malaysian report completeness (B). These findings are consistent with the features observed in the United States [], Brazil [], Spain [], South Korea [], and Japan [,]. Of note, the NPRA classified the reports from PRHs as reported by other HCPs when the primary reporter was unknown. Among 7833 E2B reports from PRHs, only 778 (9.93%) reports were well documented (Table S1 in ), with an overall completeness score of 0.39 (Figure S5 in ). Information regarding drug dosage, reaction onset, reaction outcome, and patient age was most incomplete in Malaysian reports from PRHs. This could be attributed to the lack of a robust PV culture among PRHs. Existing literature [,] suggests that PRHs might prioritize submitting a report to fulfill pharmaceutical legislation [] that mandates that PRHs report any suspected AEs within strict timelines even when minimal information is available. There could also be instances in which the primary reporter did not provide consent for follow-up. Conversely, it is conceivable that pharmacists and physicians serving at health facilities were more motivated to make a clinically meaningful report even on a voluntary basis [,,].

Reaction, Drug, and Patient

As AE reporting is highly dependent on individual motivation [], we were interested in understanding whether the nature of drugs and reactions affects the quality of reports. While a report may involve more than one drug or reaction, most studies on AE report quality have not assessed all drugs and reactions reported in a case. Toki and Ono [] examined only the primary suspected drug in a multivariable logistic regression model, whereas Araujo et al [] and Masuka and Khoza [] evaluated a specific drug group using simple univariable analysis. Other studies have evaluated only case-level information, such as case seriousness [,,-], fatal outcome [], and causality []. As we converted drug-reaction pairs to case-level data, we evaluated the influence of drug- and reaction-related factors on overall report completeness for all reported suspected and interacting drugs.

Our ML analysis of the E2B subset (B) revealed that a case where the reaction abated following a drug dechallenge (ie, positive dechallenge) was the primary key feature associated with a well-documented report. While information on drug dechallenge was unknown in most cases (Table S1 in ), it is possible that a positive dechallenge may have strengthened the reporter’s confidence in the drug-reaction causal relationship and, thus, the motivation to construct a clinically meaningful report. While our findings suggest that positive dechallenge may have motivated more complete reports, there is no supporting study on this. It is important to note that reports that are well documented by vigiGrade standards might also tend to have a positive dechallenge. In other words, it may be that vigiGrade tends to flag those reports that have a positive dechallenge as well documented.

As expected, cases containing information on time to onset were more likely to be well documented as the onset dimension incurs the highest penalty of 50% for missing data and 30% if the uncertainty exceeds 1 month []. Our findings suggest that cases with a shorter (ie, <1 day) time to onset, dosing interval, and duration of drug use were most likely to be well documented. This observation might be attributable to better recall and description of events occurring within a shorter time frame following drug use or to greater reporter confidence in the drug-reaction relationship due to stronger temporal association and a lower likelihood of confounding factors. On the other hand, reports that involved reactions lasting 1 to 6 days tended to carry more information compared to those that involved reactions lasting <1 day or >6 days. A competing hypothesis is that reactions occurring within this time frame allow for sufficient time for more observation or data gathering while still being easily observed and described by patients and HCPs. Nevertheless, further research is needed to determine the specific factors contributing to the observed differences in report quality for cases with varying time to onset, dosing intervals, and durations of drug use.

Antibiotics for systemic use (ATC code J01) emerged as a key feature favorably contributing to well-documented E2B reports. First, it could be attributed to the baseline reporting patterns, where systemic antibiotics were the most commonly reported drug group in Malaysia, with over one-fifth of AE reports involving at least one systemic antibiotic (Table S1 in ). Second, this observation might suggest that Malaysian HCPs exercise heightened caution when using anti-infectives, leading to a higher likelihood of detecting AEs related to anti-infectives with higher report completeness. According to a study from a Malaysian infectious disease hospital [], most inquiries (37.8%) received by the DIS unit concerned anti-infective drugs (ATC code J), which included other β-lactam antibacterials (ATC code J01D), direct-acting antivirals (ATC code J05A), and penicillins (ATC code J01C), with the largest proportion of the inquiries pertaining to their AEs and pediatric dosage adjustments. Although this observation could be expected in an infectious disease hospital, it also highlights the role of pharmacist-led DIS in AE monitoring and reinforces our previous discussion regarding pharmacists working in public health facilities tending to submit more complete reports. Trainings by the NPRA often prioritized pharmacists working in DIS units, who then conducted echo training for HCPs in their respective health facilities.

In addition, our analysis revealed a positive association between reports marked as serious and well-documented Malaysian reports (B), consistent with previous studies from France [,], China [,], and South Korea [] that highlighted higher completeness for serious reports. Previous research has also indicated that Malaysian HCPs prioritize reporting serious AEs []. The heightened gravity and potential consequences of these cases might prompt reporters to exercise greater diligence in ensuring reporting quality, including PRHs who are subjected to stricter reporting timelines for serious cases. Fatal outcomes were not flagged as the key feature contributing to Malaysian reports being well documented, likely due to their low prevalence (1048/68,795, 1.52%; Table S1 in ). Nonetheless, Malaysian fatal reports had a low overall completeness of 0.55 compared to 0.80 for nonfatal reports (Figure S5 in ). Another observational study evaluating reports submitted to the US Food and Drug Administration [] also found that cases of patient deaths had the lowest completeness scores across reporting sources. This could be attributed to the absence of medical terminology describing the cause of death or indicate an investigation into a potential drug involvement.

Reactions under the HLGTs product quality, supply, distribution, manufacturing, and quality system issues, of which 97.78% (2242/2293) were related to product substitution issues and 91.41% (2096/2293) were sourced from public health facilities primarily by pharmacists, were captured as the key feature that negatively contributed to E2B reports being well documented. Subsequent investigations revealed that product substitution issues were most prevalent in 2018 (1355/2242, 60.44%) and primarily involved brand switching between 2 generic products: amlodipine (1012/2242, 45.14%) and perindopril (29

留言 (0)

沒有登入
gif