Adverse Event Signal Detection Using Patients’ Concerns in Pharmaceutical Care Records: Evaluation of Deep Learning Models

Introduction

The number of people who develop cancer is expected to increase in our aging society [-]. Consequently, there is growing interest in how to detect and manage the side effects of anticancer therapies in order to improve treatment regimens and patients’ quality of life [-]. The primary approaches to side effect management are “early signal detection and early intervention” [-], so more efficient ways of achieving both are needed.

It has been recognized that patients’ voices concerning adverse events represent an important source of information. Several studies have indicated that the number, severity, and time of occurrence of adverse events might be underevaluated by physicians [-]. Thus, patient-reported outcomes (PROs) have recently received more attention in the drug evaluation process, reflecting patients’ real voices. Various kinds of PRO measures have been developed and investigated in different disease populations [,]. Health care authorities have also encouraged the pharmaceutical industry to use PROs for drug evaluation [,], and it is becoming more common to take PRO assessment results into consideration for drug marketing approval [,]. Similar trends can be seen in the clinical management of individual patients. Thus, health care professionals have an interest in understanding how to appropriately gather patients’ concerns in order to improve safety management and clinical decisions [-].

The applications of deep learning for natural language processing have expanded dramatically in recent years []. Since the development of a high-performance deep learning model in 2018 [], attempts to apply cutting-edge deep learning models to various kinds of patient-generated text data for the evaluation of safety events or the analysis of unscalable subjective information from patients have been accelerating [-]. Most studies have been conducted to use patients’ narrative data for pharmacovigilance [,-], while few have been aimed at improvement of real-time safety monitoring for individual patients. In addition, there have been some studies on adverse event severity grading based on health care records [-], but none has yet aimed to extract clinically important adverse event signals that require medical intervention from patients’ narratives. It is important to know whether deep learning models could contribute to the detection of such important adverse event signals from concern texts generated by individual patients.

To address this question, we have developed deep learning models to detect adverse event signals from individual patients with cancer based on patients’ blog articles in online communities, following other types of natural language processing–related previous work [,]. One deep learning model focused on the specific symptom of hand-foot syndrome (HFS), which is one of the typical side effects of anticancer treatments [], and another focused on a broad range of adverse events that impact patients’ activities of daily living []. We showed that our models can provide good performance scores in targeting adverse event signals. However, the evaluation relied on patients’ narratives from the patients’ blog data used for deep learning model training, so further evaluation is needed to ensure the validity and applicability of the models to other texts regarding patients’ concerns. In addition, the blog data source did not contain medical information, so it was not feasible to assess whether the models could contribute to the extraction of clinically important adverse event signals.

To address these challenges, we focused on pharmaceutical care records written by pharmacists at community pharmacies. The gold standard format for pharmaceutical care records in Japan is the SOAP (subjective, objective, assessment, plan)-based document that follows the “problem-oriented system” concept proposed by Weed [] in 1968. Pharmacists track patients’ subjective concerns in the S column, provide objective information or observations in the O column, give their assessment from the pharmacist perspective in the A column, and suggest a plan for moving forward in the P column [,]. We considered that SOAP-based pharmaceutical care records could be a unique data source suitable for further evaluation of our deep learning models because they contain both patients’ concerns and professional health care records by pharmacists, including the medication prescription history with time stamps. Therefore, this study was designed to assess whether our deep learning models could extract clinically important adverse event signals that require intervention by medical professionals from these records. We also aimed to evaluate the characteristics of the models when applied to patients’ subjective information noted in the pharmaceutical care records, as there have been only a few studies on the application of deep learning models to patients’ concerns recorded during pharmacists’ daily work [-].

Here, we report the results of applying our deep learning models to patients’ concern text data in pharmaceutical care records, focusing on patients receiving anticancer treatment.

Methods

Data Source

The original data source was 2,276,494 pharmaceutical care records for 303,179 patients, created from April 2020 to December 2021 at community pharmacies belonging to the Nakajima Pharmacy Group in Japan []. To focus on patients with cancer, records of patients with at least 1 prescription for an anticancer drug were retrieved by sorting individual drug codes (YJ codes) used in Japan (YJ codes starting with 42 refer to anticancer drugs). Records in the S column (ie, S records) were collected from the patients with cancer as the text data of patients’ concerns for deep learning model analysis.
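The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`patient_id`, `yj_codes`, `s_text`) and the function name are hypothetical and do not reflect the pharmacy group's actual schema; the only detail taken from the text is that YJ codes starting with 42 denote anticancer drugs.

```python
ANTICANCER_PREFIX = "42"  # from the text: YJ codes starting with 42 are anticancer drugs


def select_cancer_s_records(records):
    """Return (patient_id, s_text) pairs for patients with >= 1 anticancer prescription.

    Each record is a dict with hypothetical keys (not the study's actual schema):
    'patient_id' (str), 'yj_codes' (list of str), 's_text' (str, may be empty).
    """
    records = list(records)
    # Pass 1: find every patient with at least one anticancer drug code.
    cancer_patients = {
        rec["patient_id"]
        for rec in records
        if any(code.startswith(ANTICANCER_PREFIX) for code in rec["yj_codes"])
    }
    # Pass 2: keep all non-empty S records that belong to those patients.
    return [
        (rec["patient_id"], rec["s_text"])
        for rec in records
        if rec["patient_id"] in cancer_patients and rec["s_text"]
    ]


# Toy records; the codes are made-up placeholders, not real YJ codes.
toy_records = [
    {"patient_id": "p1", "yj_codes": ["420000000001"], "s_text": "My fingertips hurt."},
    {"patient_id": "p1", "yj_codes": ["210000000001"], "s_text": "No problems this week."},
    {"patient_id": "p2", "yj_codes": ["210000000002"], "s_text": "Feeling fine."},
]
selected = select_cancer_s_records(toy_records)
```

In this sketch, every S record of a patient is kept once that patient has any anticancer prescription, which is how we read "collected from the patients with cancer"; a stricter per-visit filter would be a different design.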

Deep Learning Models

The deep learning models used for this research were those that we constructed based on patients’ narratives in blog articles posted in an online community and that showed the best performance score in each task in our previous work (ie, a Bidirectional Encoder Representations From Transformers [BERT]–based model for HFS signal extraction [] and a T5-based model for adverse event signal extraction []). BERT [] and T5 [] both belong to a type of deep learning model that has recently shown high performance in several studies [,]. Hereafter, we refer to the deep learning model for HFS signals as the HFS model, the model for any adverse event signals as All AE (ie, all or any adverse events) model, and the model for adverse event signals limited to patients’ activities of daily living as the AE-L (adverse events limiting patients’ daily lives) model. It was also confirmed that these deep learning models showed similar or higher performance scores for the HFS, All AE, or AE-L identification tasks using 1000 S records randomly extracted from the data source of this study compared to the values obtained in our previous work [,] (the performance scores of sentence-level tasks from our previous work are comparable, as the mean number of words in the sentences in the data source in our previous work was 32.7 [SD 33.9], which is close to that of the S records used in this study, 38.8 [SD 29.4]). The method and results of the performance-level check are described in detail in [,]. We applied the deep learning models to all text data in this study without any adjustment in setting parameters from those used in constructing them based on patient-authored texts in our previous work [,].

Evaluation of Extracted S Records by the Deep Learning Models

In this study, we focused on the evaluation of S records that our deep learning models extracted as HFS or AE-L positive. Each positive S record was assessed in terms of whether it was a true adverse event signal, what type of adverse event symptom it described, and whether or not an intervention was made by health care professionals. We also investigated the kind of anticancer treatment prescription in connection with each adverse event signal identified in S records.

To assess whether an extracted positive S record was a true adverse event signal, we used the same annotation guidelines as in our previous work []. In brief, each S record was treated as an “adverse event signal” if any untoward medical occurrence happened to the patient, regardless of the cause. For the AE-L model only, if a positive S record was confirmed as an adverse event signal, it was further categorized into 1 or more of the following adverse event symptoms: “fatigue,” “nausea,” “vomiting,” “diarrhea,” “constipation,” “appetite loss,” “pain or numbness,” “rash or itchy,” “hair loss,” “menstrual irregularity,” “fever,” “taste disorder,” “dizziness,” “sleep disorder,” “edema,” or “others.”

For the assessment of interventions by health care professionals and anticancer treatment prescriptions, information from the O, A, and P columns and the drug prescription history in the data source were investigated for the extracted positive S records. The interventions by health care professionals were categorized into one of the following: “adding symptomatic treatment for the adverse event signal,” “dose reduction or discontinuation of causative anticancer treatment,” “consultation with physician,” “others,” or “no intervention (ie, just following up the adverse event signal).” The actions categorized as “others” were further evaluated individually. For this assessment, we also randomly extracted 200 S records and evaluated them in the same way for comparison with the results from the deep learning models. The prescription history of anticancer treatment was analyzed by primary category of mechanism of action (MoA), with subcategories if applicable (eg, target molecule for kinase inhibitors).

Applicability Check to Other Text Data Including Patients’ Concerns

To check the applicability of our deep learning models to data from a different source, interview transcripts from patients with cancer were also evaluated. The interview transcripts were created by the Database of Individual Patient Experiences-Japan (DIPEx-Japan) []. DIPEx-Japan divides the interview transcripts into sections for each topic, such as “onset of disease” and “treatment,” and posts the processed texts on its website. Processing is conducted by accredited researchers based on qualitative research methods established by the University of Oxford []. In this study, interview text data created from interviews with 52 patients with breast cancer conducted from January 2008 to October 2018 were used to assess whether our deep learning models can extract adverse event signals from this source. In total, 508 interview transcripts were included with the approval of DIPEx-Japan.

Ethical Considerations

This study was conducted with anonymized data following approval by the ethics committee of the Keio University Faculty of Pharmacy (210914-1 and 230217-1) and in accordance with relevant guidelines and regulations and the Declaration of Helsinki. Informed consent specific to this study was waived due to the retrospective observational design of the study with the approval of the ethics committee of the Keio University Faculty of Pharmacy. To respect the will of each individual stakeholder, however, we provided patients and pharmacists of the pharmacy group with an opportunity to refuse the sharing of their pharmaceutical care records by posting an overview of this study at each pharmacy store or on their web page regarding the analysis using pharmaceutical care records. Interview transcripts from DIPEx-Japan were provided through a data sharing arrangement for using narrative data for research and education. Consent for interview transcription and its sharing from DIPEx-Japan was obtained from the participants when the interviews were recorded.

Results

Data Set

From the original data source of 2,180,902 pharmaceutical care records for 291,150 patients, S records written by pharmacists for patients with a history of at least 1 prescription of an anticancer drug were extracted. This yielded 30,784 S records for 2479 patients with cancer (Table 1). The mean and median number of words in the S records were 38.8 (SD 29.4) and 32 (IQR 20-50), respectively. We applied our deep learning models, HFS, All AE, and AE-L, to these 30,784 S records for the evaluation of the deep learning models for adverse event signal detection.

For interview transcripts created by DIPEx-Japan, the mean and median number of words were 428.9 (SD 160.9) and 416 (IQR 308-526), respectively, in the 508 transcripts for 52 patients with breast cancer.

Table 1. Characteristics of the pharmaceutical care records.

Item | Values
Duration | April 2020 to December 2021 (1 year and 9 months)
Patients |
  Total, n | 291,150
  At least 1 anticancer drug prescription, n (%) | 2479 (0.9)
Pharmaceutical care records |
  SOAP^a records for all patients, n | 2,180,902
  S records for patients with at least 1 anticancer drug prescription, n (%) | 30,784 (1.4)
Words in S^b records for patients with at least 1 anticancer drug prescription |
  Mean (SD) | 38.8 (29.4)
  Median (IQR) | 32 (20-50)

^a SOAP: subjective, objective, assessment, plan.

^b S: subjective.

Application of the HFS Model

Overview

First, we applied the HFS model to the S records for patients with cancer. The BERT-based model was used for this research as it showed the best performance score in our previous work [].

S Records Extracted as HFS Positive

The S records extracted as HFS positive by the HFS model (Table 2) amounted to 167 (0.5%) records for 119 (4.8%) patients. A majority of these patients had 1 HFS-positive S record (n=91, 76.5%), while 2 (1.7%) patients had as many as 6 HFS-positive records. When we examined whether the extracted S records were true adverse event signals, 152 records were confirmed to be adverse event signals, while the other 15 records were false-positives. All the false-positive S records were descriptions of the absence of symptoms or confirmation of an improving condition (eg, “no diarrhea, mouth ulcers, or limb pain so far” or “the skin on the soles of my feet has calmed down a lot with this ointment”). Some examples of S records that were predicted as HFS positive by the model are shown in Table S1 in .

The same examination was conducted with interview transcripts from DIPEx-Japan. Only 1 (0.2%) transcript was extracted as HFS positive by the HFS model, and it was a true adverse event signal (100%). The actual transcript extracted as HFS positive is shown in Table S2 in .

Table 2. S^a records extracted as HFS^b positive by the HFS model.

Item | Values
Positive patients (n=2479), n (%) | 119 (4.8)
Positive S records (n=30,784), n (%) | 167 (0.5)
Positive S records per patient, mean (SD) | 1.40 (0.92)
Breakdown of patient numbers by number of positive S records, n (%) |
  1 record | 91 (76.5)
  2 records | 18 (15.1)
  3 records | 4 (3.4)
  4 records | 4 (3.4)
  6 records | 2 (1.7)
Adverse event signal, n (%) |
  Yes | 152 (91)
  No (ie, false-positive^c) | 15 (9)

^a S: subjective.

^b HFS: hand-foot syndrome.

^c All false-positive S records were denial of symptoms or confirmation of improving condition.
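As a quick consistency check, the percentages in Table 2 can be re-derived from the reported counts. The snippet below is a minimal sketch; the helper name `pct` and the variable names are illustrative and not part of the study, while the counts are those given in the table.

```python
def pct(count, total):
    """Percentage of `count` in `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts as reported in Table 2.
total_patients, total_s_records = 2479, 30784
positive_patients, positive_records = 119, 167
true_signals, false_positives = 152, 15

# Number of patients keyed by how many HFS-positive S records they had.
breakdown = {1: 91, 2: 18, 3: 4, 4: 4, 6: 2}

# Internal consistency: the breakdown must reproduce both totals.
patients_from_breakdown = sum(breakdown.values())
records_from_breakdown = sum(n_records * n_patients
                             for n_records, n_patients in breakdown.items())

share_of_patients = pct(positive_patients, total_patients)
share_of_records = pct(positive_records, total_s_records)
true_signal_share = pct(true_signals, positive_records)
mean_records_per_patient = round(positive_records / positive_patients, 2)
```

The proportion of true adverse event signals computed this way uses the extracted records as the denominator, so it describes how clean the extracted set is rather than how many signals were missed.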

Interventions by Health Care Professionals

The 167 S records extracted as HFS positive, as well as 200 randomly selected records, were checked for interventions by health care professionals (Figure 1). The proportion showing any action by health care professionals was 64.1% for the 167 HFS-positive S records compared to 13% for the 200 random S records. Among the actions taken for HFS positives, “adding symptomatic treatment” was the most common, accounting for around half (n=79, 47.3%), followed by “other” (n=18, 10.8%). Most “other” actions were educational guidance from pharmacists, such as instructions on moisturizing, nail care, or application of ointment and advice on daily living (eg, “avoid tight socks”).

Figure 1. Interventions in cases with HFS-positive S records and in random cases. HFS: hand-foot syndrome; HP: health care professional.

Anticancer Drugs Prescribed

The types of anticancer drugs prescribed for HFS-positive patients are summarized in Table 3 based on the prescription histories. For the 152 adverse event signals identified by the HFS model in the previous section, the most common MoA class of anticancer drugs used for the patients was antimetabolite (n=62, 40.8%), specifically fluoropyrimidines (n=59, 38.8%). Kinase inhibitors were next (n=49, 32.2%), with epidermal growth factor receptor (EGFR) inhibitors and multikinase inhibitors as major subgroups (n=28, 18.4% and n=14, 9.2%, respectively). The next most common MoA classes were aromatase inhibitors (n=24, 15.8%), followed by antiandrogens and antiestrogens (n=7, 4.6% each), all used for hormone therapy.

Table 3. MoA classes of anticancer drugs prescribed for patients with adverse event signals identified by the HFS model (n=152).

Anticancer drugs | Adverse event signals identified by HFS model, n (%)
Antimetabolites |
  Overall | 62 (40.8)
  Fluoropyrimidine | 59 (38.8)
  Folate analog | 3 (2)
Kinase inhibitors |
  Overall | 49 (32.2)
  EGFR^a | 28 (18.4)
  Multi | 14 (9.2)
  VEGF^b | 6 (3.9)
  HER2^c | 1 (0.7)
Aromatase inhibitors | 24 (15.8)
Antiandrogens | 7 (4.6)
Antiestrogens | 7 (4.6)
CDK4/6^d inhibitors | 3 (2)
Alkylating agents | 1 (0.7)

^a EGFR: epidermal growth factor receptor.

^b VEGF: vascular endothelial growth factor.

^c HER2: human epidermal growth factor receptor-2.

^d CDK4/6: cyclin-dependent kinase 4/6.

Application of the All AE or AE-L Model

Overview

The All AE and AE-L models were also applied to the same S records for patients with cancer. The T5-based model was used for this research as it gave the best performance score in our previous work [].

S Records Extracted as All AE or AE-L Positive

The numbers of S records extracted as positive were 7604 (24.7%) for 1797 patients and 196 (0.6%) for 142 patients for All AE and AE-L, respectively. In the case of All AE, patients tended to have multiple adverse event positives in their S records (n=1315, 73.2% of patients had at least 2 positives). In the case of AE-L, most patients had only 1 AE-L positive (n=104, 73.2%), and the largest number of AE-L positives for 1 patient was 4 (n=4, 2.8%; Table 4).

We focused on AE-L evaluation due to its greater importance from a medical viewpoint and lower workload for manual assessment, considering the number of positive S records. Of the 196 AE-L–positive S records, it was confirmed that 157 (80.1%) records accurately extracted adverse event signals, while 39 (19.9%) records were false-positives that did not include any adverse event signals (Table 4). The contents of the 39 false-positives were all descriptions of the absence of symptoms or confirmation of an improving condition, showing a similar tendency to the HFS false-positives (eg, “The diarrhea has calmed down so far. Symptoms in hands and feet are currently fine” and “No symptoms for the following: upset in stomach, diarrhea, nausea, abdominal pain, abdominal pain or stomach cramps, constipation”). Examples of S records that were predicted as AE-L positive are shown in Table S3 in .

The deep learning models were also applied to interview transcripts from DIPEx-Japan in the same manner. The deep learning models identified 84 (16.5%) and 18 (3.5%) transcripts as All AE or AE-L positive, respectively. Of the 84 All AE–positive transcripts, 73 (86.9%) were true adverse event signals. The false-positives of All AE (n=11, 13.1%) were categorized into any of the following 3 types: explanations about the disease or its prognosis, stories when their cancer was discovered, or emotional changes that did not include clear adverse event mentions. With regard to AE-L, all the 18 (100%) positives were true adverse event signals (Table S4 in ). Examples of actual transcripts extracted as All AE or AE-L positive are shown in Table S5 in .

Table 4. S^a records extracted as positive by the All AE^b or AE-L^c model.

Item | Values
All AE positive |
  Positive patients (n=2479), n (%) | 1797 (72.5)
  Positive S records (n=30,784), n (%) | 7604 (24.7)
  Positive S records per patient, mean (SD) | 4.23 (4.17)
  Breakdown of patient numbers by number of positive S records, n (%) |
    1 record | 482 (26.8)
    2-5 records | 849 (47.3)
    6-10 records | 337 (18.8)
    11-20 records | 114 (6.3)
    >20 records | 15 (0.8)
AE-L positive |
  Positive patients (n=2479), n (%) | 142 (5.7)
  Positive S records (n=30,784), n (%) | 196 (0.6)
  Positive S records per patient, mean (SD) | 1.38 (0.72)
  Breakdown of patient numbers by number of positive S records, n (%) |
    1 record | 104 (73.2)
    2 records | 26 (18.3)
    3 records | 8 (5.6)
    4 records | 4 (2.8)
  Adverse event signal, n (%) |
    Yes | 157 (80.1)
    No (ie, false-positive^d) | 39 (19.9)

^a S: subjective.

^b All AE: all (or any of) adverse event.

^c AE-L: adverse events limiting patients’ daily lives.

^d All false-positive S records were denial of symptoms or confirmation of improving condition.

Interventions by Health Care Professionals

Whether or not interventions were made by health care professionals was investigated for the 196 AE-L–positive S records. As in the HFS model evaluation, data from 200 randomly selected S records were used for comparison (Figure 2). In total, 91 (46.4%) of the 196 AE-L–positive records were accompanied by an intervention, while the corresponding figure in the 200 random records was 26 (13%) records. The most common action in response to adverse event signals identified by the AE-L model was “adding symptomatic treatment” (n=71, 36.2%), followed by “other” (n=11, 5.6%). “Other” included educational guidance from pharmacists, inquiries from pharmacists to physicians, or recommendations for patients to visit a doctor.

Figure 2. Interventions in cases with AE-L-positive S records and in random cases. AE-L: adverse events limiting patients’ daily lives; HP: health care professional.

Anticancer Drugs Prescribed

The types of anticancer drugs prescribed for patients with adverse event signals identified by the AE-L model were summarized based on the prescription histories (Table 5). In connection with the 157 adverse event signals, the most common MoA class of the prescribed anticancer drugs was antimetabolite (n=62, 39.5%), most of which were fluoropyrimidines (n=53, 33.8%). Kinase inhibitors (n=31, 19.7%) formed the next largest category, with multikinase inhibitors (n=14, 8.9%) as the major subgroup. These were followed by antiandrogens (n=27, 17.2%), antiestrogens (n=10, 6.4%), and aromatase inhibitors (n=10, 6.4%) for hormone therapy.

Table 5. MoA classes of anticancer drugs prescribed for patients with adverse event signals identified by the AE-L model (n=157).

Anticancer drugs | Adverse event signals identified by AE-L model, n (%)
Antimetabolites |
  Overall | 62 (39.5)
  Fluoropyrimidine | 53 (33.8)
  Trifluridine | 5 (3.2)
  Purine analog | 3 (1.9)
  Folate analog | 1 (0.6)
  Ribonucleotide reductase inhibitor | 1 (0.6)
Kinase inhibitors |
  Overall | 31 (19.7)
  Multi | 14 (8.9)
  EGFR^a | 9 (5.7)
  JAK^b | 3 (1.9)
  VEGF^c | 2 (1.3)
  BTK^d | 2 (1.3)
  FLT3^e | 1 (0.6)
Antiandrogens | 27 (17.2)
Antiestrogens | 10 (6.4)
Aromatase inhibitors | 10 (6.4)
Topoisomerase inhibitors | 6 (3.8)
PARP^f inhibitors | 6 (3.8)
CDK4/6^g inhibitors | 4 (2.5)
Alkylating agents | 1 (0.6)
Anti-CD20^h antibodies | 1 (0.6)

^a EGFR: epidermal growth factor receptor.

^b JAK: Janus kinase.

^c VEGF: vascular endothelial growth factor.

^d BTK: Bruton tyrosine kinase.

^e FLT3: FMS-like tyrosine kinase-3.

^f PARP: poly-ADP ribose polymerase.

^g CDK4/6: cyclin-dependent kinase 4/6.

^h CD20: cluster of differentiation 20.

Adverse Event Symptoms

For the 157 adverse event signals identified by the AE-L model, the symptoms were categorized according to the predefined guideline in our previous work []. “Pain or numbness” (n=57, 36.3%) accounted for the largest proportion followed by “fever” (n=46, 29.3%) and “nausea” (n=40, 25.5%; Table 6). Symptoms classified as “others” included chills, tinnitus, running tears, dry or peeling skin, and frequent urination. When comparing the proportion of the symptoms associated with or without interventions by health care professionals, a trend toward a greater proportion of interventions was observed in “fever,” “nausea,” “diarrhea,” “constipation,” “vomiting,” and “edema” (Figure 3, black boxes). On the other hand, a smaller proportion was observed in “pain or numbness,” “fatigue,” “appetite loss,” “rash or itchy,” “taste disorder,” and “dizziness” (Figure 3, gray boxes).

Table 6. Symptoms of adverse event signals identified by the AE-L model from subjective records (n=157).

Symptoms | Adverse event signals identified by the AE-L model, n (%)
Pain or numbness | 57 (36.3)
Fever | 46 (29.3)
Nausea | 40 (25.5)
Fatigue | 23 (14.6)
Rash or itchy | 17 (10.8)
Appetite loss | 16 (10.2)
Diarrhea | 16 (10.2)
Constipation | 11 (7)
Vomiting | 8 (5.1)
Taste disorder | 7 (4.5)
Dizziness | 6 (3.8)
Edema | 1 (0.6)
Menstrual irregularity | 0 (0)
Hair loss | 0 (0)
Sleep disorder | 0 (0)
Others | 21 (13.4)

Figure 3. Symptoms of adverse event signals identified by the AE-L model from subjective records by intervention (n=157). AE-L: adverse events limiting patients’ daily lives.
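The symptom counts in Table 6 add up to 269, which is more than the 157 adverse event signals, because each signal could be assigned to 1 or more symptom categories. A minimal sketch of such a multi-label tally is shown below; the function name and the four toy annotations are invented for illustration and are not study data.

```python
from collections import Counter


def tally_symptoms(annotations):
    """Count symptom labels over records that may each carry several labels.

    `annotations` is a list with one set of symptom labels per record.
    Returns {label: (count, percent of records)}; because the denominator is
    the number of records, the percentages can sum to more than 100.
    """
    counts = Counter(label for labels in annotations for label in labels)
    n_records = len(annotations)
    return {
        label: (count, round(100 * count / n_records, 1))
        for label, count in counts.items()
    }


# Toy annotations (hypothetical), one set of labels per adverse event signal.
toy_annotations = [
    {"fever", "nausea"},
    {"nausea"},
    {"pain or numbness"},
    {"fever", "fatigue", "nausea"},
]
toy_table = tally_symptoms(toy_annotations)
total_percent = sum(percent for _, percent in toy_table.values())
```

With per-record denominators, each row answers "what share of signals mentioned this symptom," which is the reading that matches the n (%) column of Table 6.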
Discussion

Overview

This study was designed to evaluate our deep learning models, previously constructed based on patient-authored texts posted in an online community, by applying them to pharmaceutical care records that contain both patients’ subjective concerns and medical information created by pharmacists. Based on the results, we discuss whether these deep learning models can extract clinically important adverse event signals that require medical intervention, and what characteristics they show when applied to data on patients’ concerns in pharmaceutical care records.

Performance for Adverse Event Signal Extraction

The first requirement for the deep learning models is to extract adverse event signals from patients’ narratives precisely. In this study, we evaluated the proportion of true adverse event signals in positive S records extracted by the HFS or AE-L model. True adverse event signals amounted to 152 (91%) and 157 (80.1%) for the HFS and AE-L models, respectively (Tables 2 and 4). Given that the proportion of true adverse event signals in 200 randomly extracted S records without deep learning models was 54 (27%; categories other than “no adverse event” in Figures 1 and 2), the HFS and AE-L models were able to concentrate S records with adverse event mentions. Although 15 (9%) records for the HFS model and 39 (19.9%) records for the AE-L model were false-positives, it was confirmed that all of the false-positive records described a lack of symptoms or confirmation of an improving condition. We consider that such false-positives are due to a unique feature of pharmaceutical care records, namely that pharmacists might proactively interview patients about potential side effects of their medications. As the data set of blog articles we used to construct the deep learning models included few such cases (especially comments on lack of symptoms), our models seemed unable to exclude them correctly. Even though we confirmed that the proportion of true “adverse event” signals extracted from the S records by the HFS or AE-L model was more than 80%, the performance scores for extracting true “HFS” or “AE-L” signals were not so high based on the performance check using 1000 randomly extracted S records (F1-scores were 0.50 and 0.22 for true HFS and AE-L signals, respectively; Table S1 in ).

We consider that the performance in extracting true HFS and AE-L signals was relatively low due to the short length of texts in the S records, which provide less context to judge the impact on patients’ daily lives, especially for the AE-L model (the mean word number of the S records was 38.8 [SD 29.4; Table 1], similar to the sentence-level tasks in our previous work [,]). However, we consider that a true adverse event signal proportion of more than 80% in this study represents a promising outcome, as this is the first attempt to apply our deep learning models to a different source of patients’ concern data, and the extracted positive cases would be worthy of evaluation by a medical professional, as the potential adverse events could be caused by drugs taken by the patients.
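For reference, the F1-score quoted above is the harmonic mean of precision and recall. The standard definitions are given below; the symbols TP, FP, and FN (true-positive, false-positive, and false-negative counts) are notation introduced here for illustration and are not taken from the study.

```latex
\mathrm{Precision} = \frac{TP}{TP + FP}, \qquad
\mathrm{Recall} = \frac{TP}{TP + FN}, \qquad
F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}
```

Under this notation, the proportion of true adverse event signals among extracted records is a precision-type quantity evaluated against the broader "any adverse event" label, whereas the F1-scores were evaluated against the narrower HFS or AE-L labels and also depend on recall. This may be one reason the two figures differ so much.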

When the deep learning models were applied to DIPEx-Japan interview transcripts, including patients’ concerns, the proportion of true adverse event signals was also more than 80% (for All AE: n=73, 86.9%; for HFS: n=1, 100%; and for AE-L: n=18, 100%). The difference in the results between pharmaceutical care S records and DIPEx-Japan interview transcripts lay in the nature of the false-positives: descriptions of a lack of symptoms or confirmation of an improving condition in S records versus explanations about the disease or its prognosis, stories about when the cancer was discovered, or emotional changes in interview transcripts. We attribute this to the difference in the nature of the data sources; the pharmaceutical care records were generated in real time by pharmacists through their daily work, where adverse event signals are proactively monitored, while the interview transcripts were purely based on patients’ retrospective memories. In both text data sources, more than 80% of the records extracted by our deep learning models were true adverse event signals in spite of the difference in their nature. When looking at future implementation of the deep learning models in society (discussed in the Potential for Deep Learning Model Implementation in Society section), it may be desirable to further adjust deep learning models to reduce false-positives depending upon the features of the data source.

Identification of Important Adverse Events Requiring Medical Intervention

To assess whether the models could extract clinically important adverse event signals, we investigated interventions by health care professionals connected with the adverse event signals identified by our deep learning models. In the 200 randomly extracted S records, only 26 (13%) contained adverse event signals that led to any intervention by health care professionals. On the other hand, the proportion of signals associated with interventions increased to 107 (64.1%) and 91 (46.4%) in the S records extracted as positive by the HFS and AE-L models, respectively (Figures 1 and 2). These results suggest that both deep learning models can screen clinically important adverse event signals that require intervention from health care professionals. The performance level in screening adverse event signals requiring medical intervention was higher in the HFS model than in the AE-L model (n=107, 64.1% vs n=91, 46.4%; Figures 1 and 2). Since HFS is one of the typical side effects of some anticancer drugs and the target events of the HFS model were specific and narrowly defined, we consider that health care providers paid special attention to HFS-related signals and took action proactively. In both deep learning models, similar trends were observed in actions taken by health care professionals in response to extracted adverse event signals; common actions were attempts to manage adverse event symptoms by symptomatic treatment or other mild interventions, including educational guidance from pharmacists or recommendations for patients to visit a doctor. More direct interventions focused on the causative drugs (ie, “dose reduction or discontinuation of anticancer treatment”) amounted to less than 5%; 7 (4.2%) for the HFS model and 6 (3.1%) for the AE-L model (Figures 1 and 2).

Thus, it appears that our deep learning models can contribute to screening mild to moderate adverse event signals that require preventive actions such as symptomatic treatments or professional advice from health care providers, especially for patients with less sensitivity to adverse event signals or who have few opportunities to visit clinics and pharmacies.
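The comparison between model-extracted and randomly selected records can be expressed as a simple ratio of intervention rates. The sketch below uses the counts reported in this section; the function and variable names are illustrative, and the ratio is a descriptive summary only, since the study does not report a statistical test for this comparison.

```python
def intervention_rate(with_intervention, total):
    """Fraction of records that were accompanied by any intervention."""
    return with_intervention / total


# Counts reported in the text.
random_rate = intervention_rate(26, 200)   # 200 randomly selected S records
hfs_rate = intervention_rate(107, 167)     # HFS-positive S records
ael_rate = intervention_rate(91, 196)      # AE-L-positive S records

# How many times more often an extracted record carried an intervention
# than a randomly selected one (descriptive ratio, not a significance test).
hfs_ratio = hfs_rate / random_rate
ael_ratio = ael_rate / random_rate

print(f"HFS model:  {hfs_rate:.1%} vs {random_rate:.1%} (ratio {hfs_ratio:.2f})")
print(f"AE-L model: {ael_rate:.1%} vs {random_rate:.1%} (ratio {ael_ratio:.2f})")
```

Both ratios are computed against the same 200 random records, so they are comparable with each other but share that sample's uncertainty.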

Ability to Catch Real Side Effect Signals of Anticancer Drugs

Based on the drug prescription history associated with S records extracted as HFS or AE-L positive, the type and duration of anticancer drugs taken by patients experiencing the adverse event signals were investigated. For the HFS model, the most common MoA of anticancer drug was antimetabolite (fluoropyrimidine: n=59, 38.8%), followed by kinase inhibitors (n=49, 32.2%, of which EGFR inhibitors and multikinase inhibitors accounted for n=28, 18.4% and n=14, 9.2%, respectively) and aromatase inhibitors (n=24, 15.8%; ). It is known that fluoropyrimidine and multikinase inhibitors are typical HFS-inducing drugs [-], suggesting that the HFS model accurately extracted HFS side effect signals derived from these drugs. Note that symptoms such as acneiform rash, xerosis, eczema, paronychia, changes in the nails, arthralgia, or stiffness of limb joints, which are common side effects of EGFR inhibitors or aromatase inhibitors [,], might be extracted as closely related expressions to those of HFS signals. When looking at the MoA of anticancer drugs for patients with adverse event signals identified by the AE-L model, antimetabolite (fluoropyrimidine) was the most common one (n=53, 33.8%), as in the case of those identified by the HFS model, followed by kinase inhibitors (n=31, 19.7%) and antiandrogens (n=27, 17.2%; ). Since the AE-L model targets a broad range of adverse event symptoms, it is difficult to rationalize the relationship between the adverse event signals and types of anticancer drugs. However, the type of anticancer drugs would presumably closely correspond to the standard treatments of the cancer types of the patients. Based on the prescribed anticancer drugs, we can infer that a large percentage of the patients had breast or lung cancer, indicating that our study results were based on data from such a population. 
Thus, a possible direction for the expansion of this research would be to adjust the deep learning models by additional training with expressions for typical side effects associated with standard treatments of other cancer types. To interpret these results correctly, it should be noted that we could not investigate anticancer treatments conducted outside of the pharmacies (eg, the time-course relationship with intravenously administered drugs would be missed because such drugs are administered at hospitals). To further evaluate how useful this model is in side effect signal monitoring for patients with cancer, comprehensive medical information for the eligible patients would be required.
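As a quick arithmetic check on the percentages reported for the HFS model, the following minimal sketch recomputes each share from the reported counts. The denominator of 152 is an assumption inferred from the reported count and percentage pairs (eg, 59/152 rounds to 38.8%); it is not stated in this section. The names `hfs_counts`, `ASSUMED_HFS_TOTAL`, and `share` are illustrative only.

```python
# Reported counts of HFS-positive signals by mechanism of action (from the text).
# EGFR and multikinase inhibitors are subsets of the kinase inhibitor group.
hfs_counts = {
    "antimetabolite (fluoropyrimidine)": 59,
    "kinase inhibitors": 49,
    "EGFR inhibitors": 28,
    "multikinase inhibitors": 14,
    "aromatase inhibitors": 24,
}

# ASSUMPTION: total inferred from the reported percentages, not stated in the text.
ASSUMED_HFS_TOTAL = 152


def share(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


hfs_shares = {moa: share(n, ASSUMED_HFS_TOTAL) for moa, n in hfs_counts.items()}

for moa, pct in hfs_shares.items():
    print(f"{moa}: n={hfs_counts[moa]}, {pct}%")
```

Under this assumed total, the recomputed shares agree with the reported values; a different denominator would not reproduce all five percentages at once.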

Suitability of the Deep Learning Models for Specific Adverse Event Symptoms

The adverse event signals identified by the AE-L model were categorized by symptom type according to a predefined annotation guideline that we previously developed []. The most frequently recorded adverse event signals identified by the AE-L model were “pain or numbness” (n=57, 36.3%), “fever” (n=46, 29.3%), and “nausea” (n=40, 25.5%; ). Since the pharmaceutical care records had information about interventions by health care professionals, we examined how often interventions were present or absent for each symptom. A trend toward a greater proportion of interventions was observed for “fever,” “nausea,” “diarrhea,” “constipation,” “vomiting,” and “edema” (, black boxes). There seem to be 2 possible explanations for this: either these symptoms are of high importance and require early medical intervention, or effective symptomatic treatments are available for these symptoms in clinical practice so that medical intervention is an easy option. On the other hand, a trend toward a smaller proportion of interventions was observed for “pain or numbness,” “fatigue,” “appetite loss,” “rash or itchy,” “taste disorder,” and “dizziness” (, gray boxes). The reason for this may be the lack of effective symptomatic treatments or the difficulty of judging whether the severity of these symptoms justifies medical intervention by health care providers. In either case, there may be room for improvement in the quality of medical care for these symptoms. We expect that our research will contribute to a quality improvement in safety monitoring in clinical practice by supporting adverse event signal detection in a cost-effective manner.
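The per-symptom comparison of interventions described above amounts to a grouped proportion. The sketch below illustrates one way such a tally could be computed; it is not the procedure used in this study. The records in `toy_signals` are hypothetical values invented for illustration, and the function name `intervention_rates` is likewise an assumption.

```python
from collections import defaultdict

# HYPOTHETICAL toy records: (symptom category, whether an intervention was recorded).
# These are illustrative values only, not data from the study.
toy_signals = [
    ("fever", True),
    ("fever", True),
    ("fever", False),
    ("pain or numbness", False),
    ("pain or numbness", False),
    ("pain or numbness", True),
    ("pain or numbness", False),
    ("nausea", True),
]


def intervention_rates(signals):
    """Map each symptom to (n_signals, fraction of signals with an intervention)."""
    totals = defaultdict(int)
    intervened = defaultdict(int)
    for symptom, had_intervention in signals:
        totals[symptom] += 1
        if had_intervention:
            intervened[symptom] += 1
    return {s: (totals[s], intervened[s] / totals[s]) for s in totals}


rates = intervention_rates(toy_signals)

# List symptoms from the highest to the lowest intervention rate.
for symptom, (n, rate) in sorted(rates.items(), key=lambda kv: -kv[1][1]):
    print(f"{symptom}: n={n}, intervention rate={rate:.0%}")
```

Reporting the signal count alongside each rate matters here, because a rate based on very few signals (such as the single toy “nausea” record) says little about a trend.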

Potential for Deep Learning Model Implementation in Society

Although we evaluated our deep learning models using pharmaceutical care records in this study, the main target of future implementation of our deep learning models in society would be narrative texts that patients directly write to record their daily experiences. For example, applying these deep learning models to electronic media where patients record their daily experiences of living with disease (eg, health care–related e-communities and disease diary applications) could enable information about the onset of adverse event signals to be provided to health care providers in a timely manner. Adverse event signals could be automatically identified and shared with health care providers based on the concern texts that patients post to any platform. Such a system would have the advantage that health care providers could efficiently grasp safety-related events that patients experience outside of clinic visits, so that they could conduct more focused or personalized interactions with patients at their clinic visits. However, care should be taken to avoid placing an excessive burden on health care providers. For instance, limiting the sharing of adverse event signals to those of high severity, or summarizing adverse event signals over a week rather than sharing each one in real time, may be reasonable approaches for medical staff. We also need to think about how to encourage patients to record their daily experiences using electronic tools. Not only technical progress and support but also the establishment of an ecosystem from which both patients and medical staff can benefit will be required. Prospective studies that use deep learning models to follow up patients in the long term and evaluate outcomes will be needed. We primarily looked at patient-authored texts as targets of implementation, but our deep learning models may also be worth applying to medical data that include patients’ subjective concerns, such as pharmaceutical care S records.
As this study confirmed that our deep learning models are applicable to patients’ concern texts tracked by pharmacists, it should be possible to use them to analyze other “patient voice-like” medical text data that have not been actively investigated so far.
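The two burden-reducing options mentioned above (sharing only high-severity signals, or summarizing signals by week) can be sketched as follows. This is a minimal illustration under stated assumptions, not a design proposed or tested in this study: the `Signal` record, its 1 to 3 severity scale, the default threshold, and the function names `urgent` and `weekly_digest` are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Signal:
    """A detected adverse event signal (all fields are hypothetical)."""
    patient_id: str
    day: date
    symptom: str
    severity: int  # ASSUMPTION: 1 (mild) to 3 (severe); no scale is defined in the text


def urgent(signals, threshold=3):
    """Option 1: keep only signals at or above the severity threshold."""
    return [s for s in signals if s.severity >= threshold]


def weekly_digest(signals):
    """Option 2: count symptoms per (patient, ISO year, ISO week)."""
    digest = defaultdict(lambda: defaultdict(int))
    for s in signals:
        iso = s.day.isocalendar()
        digest[(s.patient_id, iso[0], iso[1])][s.symptom] += 1
    return {key: dict(counts) for key, counts in digest.items()}


# Hypothetical example signals for one patient across two calendar weeks.
signals = [
    Signal("p1", date(2024, 4, 1), "nausea", 1),
    Signal("p1", date(2024, 4, 3), "nausea", 2),
    Signal("p1", date(2024, 4, 4), "fever", 3),
    Signal("p1", date(2024, 4, 9), "fatigue", 1),
]

print(urgent(signals))
print(weekly_digest(signals))
```

The two functions are independent, so a platform could apply either one alone or combine them, for example by forwarding urgent signals immediately and deferring the rest to the weekly summary.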

Limitations

First, the major limitation of this study was that we were not able to collect complete medical information for the patients. Although we designed this study to analyze patients’ concerns extracted by the deep learning models and their relationship with medical information contained in the pharmaceutical care records, some information could not be tracked (eg, missing history of medical interventions or anticancer treatment at hospitals as well as diagnosis of patients’ primary cancers). Second, there might be a data creation bias in the S records of patients’ concerns written by pharmacists. For example, symptoms that have little impact on intervention decisions might be less likely to be recorded. It should also be noted that the characteristics of S records may not be consistent across different community pharmacies.

Conclusions

Our deep learning models were able to screen clinically important adverse event signals that require intervention by health care professionals from patients’ concerns in pharmaceutical care records. Thus, these models have the potential to support efficient real-time adverse event monitoring of individual patients receiving anticancer treatments. We also confirmed that these deep learning models, which were constructed based on patient-authored texts, could be applied to patients’ subjective information recorded by pharmacists through their daily work. Further research may help to expand the applicability of the deep learning models for implementation in society or for analysis of data on patients’ concerns accumulated in professional records at pharmacies or hospitals.

Acknowledgments

This work was supported by Japan Society for the Promotion of Science, Grants-in-Aid for Scientific Research (KAKENHI; grant 21H03170) and Japan Science and Technology Agency, Core Research for Evolutional Science and Technology (CREST; grant JPMJCR22N1), Japan. Mr Yuki Yokokawa and Ms Sakura Yokoyama at our laboratory advised SN about the structure of pharmaceutical care records. This study would not have been feasible without the high quality of pharmaceutical care records created by many individual pharmacists at Nakajima Pharmacy Group through their daily work.

Data Availability

The data sets generated and analyzed during this study are available from the corresponding author on reasonable request.

Authors’ Contributions

SN and SH designed the study. SN retrieved the subjective records of patients with cancer from the data source for the application of deep learning models and organized other data for subsequent evaluations. SN ran the deep learning models with the support of SW. SN, YY, and KS checked the adverse event signals for each subjective record that was extracted as positive by the models for hand-foot syndrome or adverse events limiting patients’ daily lives and evaluated the adverse event signal symptoms, details of interventions taken by health care professionals, and types of anticancer drugs prescribed for patients based on available data from the data source. HK and SI advised on the study concept and process. MS and RT provided pharmaceutical records at their community pharmacies along with advice on how to use and interpret them. SY and EA supervised the natural language processing research as specialists. SH supervised the study overall. SN drafted and finalized the paper. All authors reviewed and approved the paper.

Conflicts of Interest

SN is an employee of Daiichi Sankyo Co, Ltd. All other authors declare no conflicts of interest.

Edited by G Eysenbach; submitted 25.12.23; peer-reviewed by CY Wang, L Guo; comments to author 24.01.24; revised version received 14.02.24; accepted 09.03.24; published 16.04.24.

©Satoshi Nishioka, Satoshi Watabe, Yuki Yanagisawa, Kyoko Sayama, Hayato Kizaki, Shungo Imai, Mitsuhiro Someya, Ryoo Taniguchi, Shuntaro Yada, Eiji Aramaki, Satoko Hori. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 16.04.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
