Training a Deep Contextualized Language Model for International Classification of Diseases, 10th Revision Classification via Federated Learning: Model Development and Validation Study


Introduction

Background

The World Health Organization published a unified classification system for disease diagnoses called the International Classification of Diseases (ICD); the ICD 10th Revision (ICD-10) is widely used []. Coders classify diseases according to the rules of the ICD, and the resulting ICD codes are used for surveys, statistics, and reimbursements. The ICD-10 Clinical Modification (ICD-10-CM) is used for coding medical diagnoses and includes approximately 69,000 codes [,]. ICD-10-CM codes contain up to 7 characters; the structure is shown in Figure 1.

Figure 1. Structure of an International Classification of Diseases, 10th Revision, Clinical Modification code.

In hospitals, diagnoses for each patient are first written as text descriptions in the electronic health record. A coder then reads these records to classify diagnoses into ICD codes. Because diagnoses are initially written as free text, their ambiguity makes them difficult to code, and classifying each diagnosis is time-consuming. A discharge record may contain 1 to 20 codes, and in one trial, coders spent an average of 20 minutes assigning codes to each patient []. An automatic tool can increase the efficiency of, and reduce the labor required for, ICD classification.

Related Work

Recently, deep learning and natural language processing (NLP) models have been developed to turn plain text into vectors, making it possible to automatically classify them. Shi et al [] proposed a hierarchical deep learning model with an attention mechanism. Sammani et al [] introduced a bidirectional gated recurrent unit model to predict the first 3 or 4 digits of ICD codes based on discharge letters. Wang et al [] proposed a convolutional neural network model with an attention mechanism and gated residual network to classify Chinese records into ICD codes. Makohon et al [] showed that deep learning with an attention mechanism effectively enhances ICD-10 predictions. Previous studies also mentioned the necessity of enormous data sets and how privacy-sensitive clinical data limited the development of models for automatic ICD-10 classification [].

Federated learning has achieved impressive results in the medical field; it is used to train models on multicenter data while keeping the data private. Federated learning is widely used in medical image and signal analyses, such as brain imaging analysis [] and the classification of electroencephalography signals []. In the clinical NLP field, Liu et al [] proposed a 2-stage federated method that used clinical notes from different hospitals to extract phenotypes for medical tasks.

Previously, we applied a Word2Vec model with a bidirectional gated recurrent unit to classify ICD-10-CM codes from electronic medical records []. We analyzed the distribution of ICD-10-CM codes and extracted features from discharge notes. The model had an F1 score of 0.625 for ICD-10-CM code classification. To improve the model’s performance, we implemented bidirectional encoder representations from transformers (BERT) and found an improved F1 score of 0.715 for ICD-10-CM code classification []. We also found that the coding time decreased when coders used classification model aids; the median F1 score significantly improved from 0.832 to 0.922 (P<.05) in a trial []. Furthermore, we constructed a system to improve ease of use, comprising data processing, feature extraction, model construction, model training, and a web service interface []. Lastly, we included a rule-based algorithm in the preprocessing process and improved the F1 score to 0.853 for ICD-10-CM classification [].

Objective

This study aims to further improve the performance of the ICD-10 classification model and enable the model’s use across hospitals. In this study, we investigated the effect of federated learning on the performance of a model that was trained on medical text requiring ICD-10 classification.


Methods

Ethics Approval

The study protocol was approved by the institutional review boards of Far Eastern Memorial Hospital (FEMH; approval number: 109086-F), National Taiwan University Hospital (NTUH; approval number: 201709015RINC), and Taipei Veterans General Hospital (VGHTPE; approval number: 2022-11-005AC), and the study adhered to the tenets of the Declaration of Helsinki. Informed consent was not applicable due to the use of deidentified data.

Data Collection

Our data were acquired from electronic health records at FEMH (data recorded between January 2018 and December 2020), NTUH (data recorded between January 2016 and July 2018), and VGHTPE (data recorded between January 2018 and December 2020). The data contained the text of discharge notes and ICD-10-CM codes. Coders in each hospital annotated the ground truth ICD-10 codes.

Data Description

After duplicate records were removed, our data set contained 100,334, 239,592, and 283,535 discharge notes from FEMH, NTUH, and VGHTPE, respectively. Each record contained between 1 and 20 ICD-10-CM labels. The distribution of labels for each chapter is shown in Figure 2; chapters are defined by the first 3 characters of the code. Codes in chapters V01 to Y98 are not used for insurance reimbursement and were therefore excluded from our data set. The fewest ICD-10-CM labels were found in chapter U00 to U99, and the most were found in chapter J00 to J99. Counts of ICD-10-CM labels from the three hospitals are shown in .

The text in the data set contained alphabetic characters, punctuation, and a few Chinese characters. The punctuation count and the top 10 Chinese characters are shown in . The most common punctuation mark was the period (“.”), and the least common was the closing brace (“}”).

Figure 2. Counts of ICD-10-CM labels for 22 chapters from (A) Far Eastern Memorial Hospital, (B) National Taiwan University Hospital, and (C) Taipei Veterans General Hospital. ICD-10-CM: International Classification of Diseases, 10th Revision, Clinical Modification.

Preprocessing

We first removed duplicate medical records from the data set. We then transformed all full-width characters into half-width characters and all alphabetic characters into lowercase letters. Records shorter than 5 characters were removed, as they usually contained only meaningless words, such as “nil” and “none.” We also removed meaningless characters, such as newlines, carriage returns, horizontal tabs, and form feed characters (“\n,” “\r,” “\t,” and “\f,” respectively). Finally, all text fields were concatenated.
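The normalization described above can be sketched as follows. This is a minimal illustration (it uses Python's NFKC normalization for the full-width-to-half-width conversion), and the function names are ours rather than the authors' implementation.

```python
import re
import unicodedata
from typing import Optional

CONTROL_CHARS = re.compile(r"[\n\r\t\f]")   # newline, carriage return, tab, form feed

def normalize_text(text: str) -> Optional[str]:
    """Normalize the free text of one record."""
    # NFKC normalization converts full-width characters to half-width equivalents.
    text = unicodedata.normalize("NFKC", text)
    # Lowercase all alphabetic characters.
    text = text.lower()
    # Remove control characters and collapse whitespace.
    text = CONTROL_CHARS.sub(" ", text)
    text = re.sub(r"\s+", " ", text).strip()
    # Drop records shorter than 5 characters (usually "nil", "none", etc.).
    return text if len(text) >= 5 else None

def build_record(fields):
    """Concatenate all cleaned text fields of one record."""
    cleaned = [normalize_text(f) for f in fields]
    return " ".join(c for c in cleaned if c)
```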

To choose a better method for managing punctuation and Chinese characters during the preprocessing stage, we determined model performance by using FEMH data, given the inclusion of these characters in the data. Each experiment used 2 versions of the data. In the first version, we retained these specific characters, and in the second, we removed them. Experiment P investigated the effect of punctuation, experiment C investigated the effect of Chinese characters, and experiment PC investigated the effects of both punctuation and Chinese characters. Another method of retaining Chinese character information is using English translations of Chinese characters. Therefore, we also compared the model’s performance when Chinese characters were retained to its performance when Google Translate was used to obtain English translations.
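For illustration, the two data versions in each experiment could be generated by optionally stripping punctuation and Chinese characters with regular expressions. The character classes below (ASCII punctuation and the basic CJK Unified Ideographs block) are our assumptions, not the authors' exact definitions.

```python
import re
import string

PUNCT = re.compile("[" + re.escape(string.punctuation) + "]")
CJK = re.compile(r"[\u4e00-\u9fff]")   # basic CJK Unified Ideographs block (assumed definition)

def strip_characters(text, remove_punct, remove_chinese):
    """Produce the 'removed' data version used in experiments P, C, and PC."""
    if remove_punct:
        text = PUNCT.sub(" ", text)
    if remove_chinese:
        text = CJK.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()

# Experiment P compares retaining vs removing punctuation, experiment C does the
# same for Chinese characters, and experiment PC removes both at once.
example = strip_characters("左側 chest pain, s/p PCI.", remove_punct=True, remove_chinese=True)
```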

One-hot encoding was used for the labels. Of the 69,823 available ICD-10-CM codes, 17,745 appeared in our combined data set, resulting in a one-hot encoding vector length of 17,745. The final cohort comprised 100,334, 239,592, and 283,535 records from FEMH, NTUH, and VGHTPE, respectively; 20% (FEMH: 20,067/100,334; NTUH: 47,918/239,592; VGHTPE: 56,707/283,535) of the records were randomly selected for the testing set, and the remaining records were used as the training set.
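A minimal sketch of the multilabel encoding and split described above, using scikit-learn's MultiLabelBinarizer as an illustrative stand-in for the one-hot encoding of the 17,745 observed codes:

```python
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MultiLabelBinarizer

# Each record pairs a discharge note with the set of ICD-10-CM codes assigned by coders.
notes = ["chest pain for 3 days ...", "fever and productive cough ..."]
labels = [{"I25.10", "E78.5"}, {"J18.9"}]

# Fit on all codes observed in the combined data set (17,745 codes in the study),
# producing one binary indicator column per code.
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(labels)              # shape: (n_records, n_observed_codes)

# Hold out 20% of records as the testing set, as described above.
X_train, X_test, y_train, y_test = train_test_split(notes, y, test_size=0.2, random_state=42)
```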

Classification Model

We compared the performance of different variants of BERT, including PubMedBERT [], RoBERTa (Robustly Optimized BERT Pretraining Approach) [], ClinicalBERT [], and BioBERT (BERT for Biomedical Text Mining) []. BioBERT was pretrained with text from PubMed—the most popular bibliographic database in the health and medical science fields. ClinicalBERT was pretrained with the MIMIC-III (Medical Information Mart for Intensive Care III) data set, and its vocabulary was from English Wikipedia and the BookCorpus data set. PubMedBERT is another variant of BERT that uses training data from PubMed. The main difference between PubMedBERT and BioBERT is their vocabularies: the vocabulary of BioBERT was from English Wikipedia and the BookCorpus data set—as was the vocabulary of BERT—whereas that of PubMedBERT was from PubMed. This difference in vocabularies affects the ability to recognize words in clinical text. RoBERTa uses the original BERT architecture but was trained for longer, with a larger batch size and more training data, drawn from the BookCorpus, CC-News (CommonCrawl News), and OpenWebText data sets. RoBERTa also applied dynamic masking, meaning that the masked tokens change across training passes instead of being fixed, as in the original BERT. The vocabularies and corpora of these BERT variants are summarized in Table 1.

For our comparison, the text was first fed into the BERT tokenizer, which transformed strings into tokens. Token sequences were then truncated to 512 tokens, the maximum input length of BERT. A linear layer connected the word embeddings produced by the models to the output layer of one-hot–encoded multilabels; its output size was 17,745, matching the one-hot encoding vector size of the labels. Binary cross-entropy was used to calculate the model loss. We trained our model for 100 epochs with a learning rate of 0.00005. These models were fine-tuned for our ICD-10-CM multilabel classification task to compare their performance. Figure 3 summarizes the model architecture and preprocessing flowchart. The best-performing model and preprocessing method were chosen for subsequent federated learning.
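The setup above can be sketched in PyTorch with the Hugging Face transformers library as follows. The PubMedBERT checkpoint identifier and the use of the [CLS] token embedding as the pooled representation are assumptions on our part; BCEWithLogitsLoss is used as the standard way to apply binary cross-entropy to raw logits.

```python
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

# Assumed Hugging Face checkpoint identifier for PubMedBERT.
MODEL_NAME = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext"
NUM_LABELS = 17_745

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

class ICD10Classifier(nn.Module):
    def __init__(self, encoder, num_labels):
        super().__init__()
        self.encoder = encoder
        # Linear layer from the encoder embedding to one logit per ICD-10-CM code.
        self.classifier = nn.Linear(encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]      # [CLS] token embedding (an assumption)
        return self.classifier(cls)            # raw logits

model = ICD10Classifier(encoder, NUM_LABELS)
criterion = nn.BCEWithLogitsLoss()             # binary cross-entropy on raw logits
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# One illustrative training step; truncation enforces the 512-token input limit.
batch = tokenizer(["chest pain for 3 days ..."], truncation=True, max_length=512,
                  padding=True, return_tensors="pt")
targets = torch.zeros(1, NUM_LABELS)           # placeholder multi-hot ground truth

optimizer.zero_grad()
logits = model(batch["input_ids"], batch["attention_mask"])
loss = criterion(logits, targets)
loss.backward()
optimizer.step()
```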

Table 1. Summary of the vocabulary and corpus sources for the various bidirectional encoder representations from transformers (BERT) models.

Models | Vocabulary sources | Corpus sources (training data)
PubMedBERT | PubMed | PubMed
RoBERTa^a | The BookCorpus, CC-News^b, and OpenWebText data sets | The BookCorpus, CC-News, and OpenWebText data sets
ClinicalBERT | English Wikipedia and the BookCorpus data set | The MIMIC-III^c data set
BioBERT^d | English Wikipedia and the BookCorpus data set | PubMed

^a RoBERTa: Robustly Optimized BERT Pretraining Approach.

^b CC-News: CommonCrawl News.

^c MIMIC-III: Medical Information Mart for Intensive Care III.

^d BioBERT: BERT for Biomedical Text Mining.

Figure 3. Model architecture and processing flowchart. CLS: class token; ICD-10-CM: International Classification of Diseases, 10th Revision, Clinical Modification.

Federated Learning

With federated learning, a model can be trained without sharing data []. Clients (ie, local machines) keep their training data locally and train the same model architecture, exchanging only the weights of the model parameters. A server receives the weights from each client and averages them. After updating the global model, the server sends the new weights back to the clients, which then start a new training round. We updated the weights of our model parameters with the FederatedAveraging algorithm [] and used Flower for federated learning [].
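At its core, the FederatedAveraging step is a weighted average of the clients' model weights, with each client weighted by its number of training examples. The sketch below is a generic illustration of this aggregation, not the authors' implementation.

```python
from collections import OrderedDict
import torch

def federated_average(client_states, client_sizes):
    """Weighted average of client state_dicts (FederatedAveraging aggregation)."""
    total = sum(client_sizes)
    averaged = OrderedDict()
    for key in client_states[0]:
        averaged[key] = sum(
            state[key].float() * (n / total)
            for state, n in zip(client_states, client_sizes)
        )
    return averaged

# Server side (one round): collect the weights trained at the three hospitals,
# average them, and send the new global weights back for the next round.
# global_model.load_state_dict(federated_average(states, sizes))
```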

Flower is an open-source federated learning framework for researchers []. Flower has a server-client structure; the server and clients are started individually, and each client is assigned to a server. They communicate via the open-source Google Remote Procedure Call framework (gRPC; Google LLC) []. With gRPC, a client application can directly call a method on a server application running on a different machine. A registration center on the server manages communication with all clients. The server has 3 main modules. The first—a connection management module—maintains all current gRPC connections; each gRPC connection on the server corresponds to one client. When a gRPC connection is established, the register function is triggered to store the client's information in an array. If a client initiates a disconnection or the connection times out, the register function is called to clear the client. The second—a bridge module—caches the information exchanged over gRPC, whether it comes from the clients or from the server. Because this buffer is shared in both directions, a state transition method is used to keep the buffered information consistent; there are five states—the close, waiting for client write, waiting for client read, waiting for server write, and waiting for server read states. The third—a server handler—manages the traffic between the server and the clients.
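A minimal Flower (flwr) skeleton consistent with this setup is sketched below. The exact API has changed across flwr versions (the calls shown follow the 1.x NumPyClient interface), and train_local and evaluate_local are hypothetical placeholders for the local fine-tuning and validation loops.

```python
import flwr as fl
import torch

class HospitalClient(fl.client.NumPyClient):
    """One hospital: trains the shared model on local data only."""

    def __init__(self, model, train_loader, val_loader):
        self.model = model
        self.train_loader = train_loader
        self.val_loader = val_loader

    def get_parameters(self, config):
        return [p.detach().cpu().numpy() for p in self.model.parameters()]

    def set_parameters(self, parameters):
        for p, new in zip(self.model.parameters(), parameters):
            p.data = torch.tensor(new, dtype=p.dtype)

    def fit(self, parameters, config):
        self.set_parameters(parameters)
        train_local(self.model, self.train_loader, epochs=5)    # hypothetical helper; 5 local epochs per round
        return self.get_parameters(config), len(self.train_loader.dataset), {}

    def evaluate(self, parameters, config):
        self.set_parameters(parameters)
        loss, f1 = evaluate_local(self.model, self.val_loader)  # hypothetical helper
        return float(loss), len(self.val_loader.dataset), {"f1": float(f1)}

# Server: aggregate with FedAvg for 20 rounds.
# fl.server.start_server(server_address="0.0.0.0:8080",
#                        config=fl.server.ServerConfig(num_rounds=20),
#                        strategy=fl.server.strategy.FedAvg())

# Each hospital client connects to the central server over gRPC.
# fl.client.start_numpy_client(server_address="SERVER_IP:8080",
#                              client=HospitalClient(model, train_loader, val_loader))
```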

Clients were set up in the three hospitals, where the model was trained on local data. The weights from each client were transferred to the server, where they were averaged and global models were made (Figure 4). We set 5 epochs for each training round on the clients and 20 rounds for server aggregation. Our study was conducted on 2 nodes. Each node had an NVIDIA RTX 2080 Ti graphics processing unit (NVIDIA Corporation) with 64 GB of RAM, and one node had 2 NVIDIA TITAN RTX graphics processing units (NVIDIA Corporation) with 64 GB of RAM.

Figure 4. Federated learning architecture. FEMH: Far Eastern Memorial Hospital; NTUH: National Taiwan University Hospital; VGHTPE: Taipei Veterans General Hospital.

Label Attention

To explain the outputs of our model, we added a label attention architecture []. It calculates attention from the inner products between word vectors and each label vector. Figure 5 shows how we added the label attention architecture to our model. First, we fine-tuned the BERT model on the definitions of ICD-10-CM codes to generate the label vectors. Second, we constructed a fully connected layer whose weights were initialized with the label vectors. Third, the output produced by BERT was passed through the hyperbolic tangent function, producing word vectors. We fed the word vectors (Z) into the fully connected layer and a softmax layer; the output (α) of the softmax layer was the attention. Fourth, the hyperbolic tangent of the word vectors (H), multiplied by the attention (α), was fed into another fully connected layer and a sigmoid layer, similar to our original architecture. The output (y) was compared with the one-hot–encoded labels for the loss calculation. Finally, the attention was used to explain how the model predicted the labels by indicating the input text that corresponded to each ICD-10-CM code. The performance of the model with the label attention architecture was compared to its performance without it.
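A condensed PyTorch sketch of this label attention head is shown below. The exact wiring (which projections are shared and where the hyperbolic tangent is applied) is our interpretation of the description above and of CAML-style label attention, so layer names and dimensions are illustrative.

```python
import torch
from torch import nn

class LabelAttentionHead(nn.Module):
    def __init__(self, hidden_size, num_labels, label_vectors):
        super().__init__()
        # Fully connected layer initialized with the label vectors obtained by
        # fine-tuning BERT on the ICD-10-CM code definitions.
        self.label_proj = nn.Linear(hidden_size, num_labels, bias=False)
        self.label_proj.weight.data.copy_(label_vectors)   # (num_labels, hidden_size)
        self.output = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states):                      # (batch, seq_len, hidden)
        H = torch.tanh(hidden_states)                      # word vectors after tanh
        scores = self.label_proj(H)                        # inner products with label vectors
        alpha = torch.softmax(scores, dim=1)               # attention over tokens, per label
        # Label-specific context vectors: attention-weighted sums of word vectors.
        context = torch.einsum("bsl,bsh->blh", alpha, H)   # (batch, num_labels, hidden)
        logits = (self.output.weight * context).sum(-1) + self.output.bias
        return torch.sigmoid(logits), alpha                # predictions and attention
```

The returned attention tensor contains one distribution over input tokens per ICD-10-CM code, which is what enables the per-code highlighting illustrated later (Figure 6).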

Figure 5. Our model architecture with label attention. BERT: bidirectional encoder representations from transformers.

Metrics

We used the micro F1 score to evaluate performance because it is the harmonic mean of precision and recall and therefore yields more balanced results than precision or recall alone. The micro F1 score was calculated as follows:

$$F_{1,\text{micro}} = \frac{2 \times \text{Precision}_{\text{micro}} \times \text{Recall}_{\text{micro}}}{\text{Precision}_{\text{micro}} + \text{Recall}_{\text{micro}}}$$

where

$$\text{Precision}_{\text{micro}} = \frac{TP_{\text{sum}}}{TP_{\text{sum}} + FP_{\text{sum}}}$$

and

$$\text{Recall}_{\text{micro}} = \frac{TP_{\text{sum}}}{TP_{\text{sum}} + FN_{\text{sum}}}$$

TP_sum indicates the sum of true positives, FP_sum indicates the sum of false positives, and FN_sum indicates the sum of false negatives.
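In the multilabel setting, these sums run over every (record, code) pair. A short sketch using scikit-learn, assuming the sigmoid outputs have already been thresholded into binary predictions:

```python
import numpy as np
from sklearn.metrics import f1_score, precision_score, recall_score

# y_true and y_pred are (n_records, n_codes) binary indicator matrices; predictions
# come from thresholding the sigmoid outputs (e.g., at 0.5, an assumed threshold).
y_true = np.array([[1, 0, 1], [0, 1, 0]])
y_pred = np.array([[1, 0, 0], [0, 1, 1]])

precision = precision_score(y_true, y_pred, average="micro", zero_division=0)
recall = recall_score(y_true, y_pred, average="micro", zero_division=0)
f1 = f1_score(y_true, y_pred, average="micro", zero_division=0)
```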


Results

Comparing the Performance of Different BERT Models

The F1 scores of PubMedBERT, RoBERTa, ClinicalBERT, and BioBERT were 0.735, 0.692, 0.711, and 0.721, respectively. The F1 score of PubMedBERT was the highest, and that of RoBERTa was the lowest among all models (Table 2). Based on these results, we used PubMedBERT in the subsequent experiments.

Table 2. Performance of different bidirectional encoder representations from transformers (BERT) models.

Models | F1 score | Precision | Recall
PubMedBERT | 0.735 | 0.756 | 0.715
RoBERTa^a | 0.692 | 0.719 | 0.666
ClinicalBERT | 0.711 | 0.735 | 0.689
BioBERT^b | 0.721 | 0.754 | 0.691

^a RoBERTa: Robustly Optimized BERT Pretraining Approach.

^b BioBERT: BERT for Biomedical Text Mining.

The Model’s Performance When Retaining or Removing Punctuation or Chinese Characters

Table 3 shows the mean number of tokens for each data set preprocessing case. The mean number of tokens when removing punctuation and Chinese characters was 52.9. The mean number of tokens when the characters were retained in experiment P (punctuation), experiment C (Chinese characters), and experiment PC (punctuation and Chinese characters) was 65.0, 53.1, and 65.1, respectively. Punctuation and Chinese characters comprised 18.3% (1,301,988/7,096,460) and 0.1% (7948/7,096,460) of the tokens in our data, respectively.

Table 3. Mean number of data tokens for retaining or removing punctuation or Chinese characters.

Experiment | Mean number of tokens
Removed punctuation and Chinese characters (baseline) | 52.9
Retained punctuation | 65.0
Retained Chinese characters | 53.1
Retained punctuation and Chinese characters | 65.1

Table 4 shows the F1 scores for each data set preprocessing case. The baseline performance of the model after removing punctuation and Chinese characters was 0.7875. In experiment P, the F1 score for retaining punctuation was 0.8049—an increase of 0.0174 (2.21%). In experiment C, the F1 score for retaining Chinese characters was 0.7984—an increase of 0.0109 (1.38%). In experiment PC, the F1 score for retaining punctuation and Chinese characters was 0.8120—an increase of 0.0245 (3.11%). In all experiments, retaining these characters was better than removing them, with experiment PC showing the largest improvement in performance.

Table 4. F1 scores for retaining or removing punctuation or Chinese characters.

Experiment | F1 score | Absolute increase (percentage)
Removed punctuation and Chinese characters (baseline) | 0.7875 | N/A^a
Retained punctuation | 0.8049 | 0.0174 (2.21%)
Retained Chinese characters | 0.7984 | 0.0109 (1.38%)
Retained punctuation and Chinese characters | 0.8120 | 0.0245 (3.11%)

^a N/A: not applicable.

The Model’s Performance Before and After Translation

In the experiment where we translated Chinese into English, the F1 score for retaining the Chinese characters was 0.7984, and that for translating them into English was 0.7983.

Federated Learning

Table 5 shows the performance of the models that were trained in the three hospitals. The models trained in FEMH, NTUH, and VGHTPE had validation F1 scores of 0.7802, 0.7718, and 0.6151, respectively. The FEMH model had testing F1 scores of 0.7412, 0.5116, and 0.1596 on the FEMH, NTUH, and VGHTPE data sets, respectively. The NTUH model had testing F1 scores of 0.5583, 0.7710, and 0.1592 on the FEMH, NTUH, and VGHTPE data sets, respectively. The VGHTPE model had testing F1 scores of 0.1081, 0.1058, and 0.5692 on the FEMH, NTUH, and VGHTPE data sets, respectively. The weighted average testing F1 scores were 0.4472, 0.5353, and 0.2522 for the FEMH, NTUH, and VGHTPE models, respectively.

Table 6 shows the federated learning model’s performance in the three hospitals. The federated learning model had validation F1 scores of 0.7464, 0.6511, and 0.5979 on the FEMH, NTUH, and VGHTPE data sets, respectively. The federated learning model had testing F1 scores of 0.7103, 0.6135, and 0.5536 on the FEMH, NTUH, and VGHTPE data sets, respectively. The weighted average testing F1 score was 0.6142 for the federated learning model.

Table 5. Models that were trained in the three hospitals for International Classification of Diseases, 10th Revision classification.

Hospitals | Validation F1 score | Testing F1 scores | Weighted average testing F1 score
FEMH^a | 0.7802 | 0.7412 (FEMH), 0.5116 (NTUH^b), 0.1596 (VGHTPE^c) | 0.4472
NTUH | 0.7718 | 0.5583 (FEMH), 0.7710 (NTUH), 0.1592 (VGHTPE) | 0.5353
VGHTPE | 0.6151 | 0.1081 (FEMH), 0.1058 (NTUH), 0.5692 (VGHTPE) | 0.2522

^a FEMH: Far Eastern Memorial Hospital.

^b NTUH: National Taiwan University Hospital.

^c VGHTPE: Taipei Veterans General Hospital.

Table 6. The federated learning model’s performance in the three hospitals.

Data | Validation F1 score | Testing F1 score^a
FEMH^b data | 0.7464 | 0.7103
NTUH^c data | 0.6511 | 0.6135
VGHTPE^d data | 0.5979 | 0.5536

^a The weighted average testing F1 score was 0.6142.

^b FEMH: Far Eastern Memorial Hospital.

^c NTUH: National Taiwan University Hospital.

^d VGHTPE: Taipei Veterans General Hospital.

Label Attention

The F1 scores of the model with and without the label attention mechanism were 0.804 (precision=0.849; recall=0.763) and 0.813 (precision=0.852; recall=0.777), respectively.

Figure 6 shows a visualization of the attention for ICD-10-CM codes and their related input text. The words were colored blue based on the attention scores for different labels; the intensity of the blue color represents the magnitude of the attention score. We used ICD-10-CM codes E78.5 (“Hyperlipidemia, unspecified”) and I25.10 (“Atherosclerotic heart disease of native coronary artery without angina pectoris”) as examples.

Figure 6. Attention for International Classification of Diseases, 10th Revision, Clinical Modification codes (A) E78.5 (“Hyperlipidemia, unspecified”) and (B) I25.10 (“Atherosclerotic heart disease of native coronary artery without angina pectoris”). The intensity of the blue color represents the magnitude of the attention score.
Discussion

Principal Findings

The federated learning model outperformed each local model when tested on external data. The weighted average F1 scores on the testing sets were 0.6142, 0.4472, 0.5353, and 0.2522 for the federated learning, FEMH, NTUH, and VGHTPE models, respectively (Tables 5 and 6). Model performance decreased when models were tested on external data. Because different doctors, coders, and diseases are found in different hospitals, the style of clinical notes may differ across hospitals, and overcoming such gaps is challenging. Although the federated learning model performed worse than the locally trained models on their own local data, it performed better than the locally trained models on external data. Moreover, the label distribution in the VGHTPE data set was very different from the label distributions in the other two hospitals’ data sets (Figure 2). Therefore, the VGHTPE model only achieved F1 scores of 0.1058 and 0.1081 on the NTUH and FEMH testing sets, respectively, and the FEMH and NTUH models had F1 scores of 0.1596 and 0.1592, respectively, on the VGHTPE testing set (Table 5).

Federated learning improves model performance on external data and can be used to build an ICD coding system for use across hospitals. However, the training time required for federated learning is longer than that required for local deep learning: in our setting, federated learning took approximately 1 week, whereas local training took approximately 2 days. There are 2 reasons for this. First, communication between the server and the clients takes longer when the model is large; our model is approximately 859 MB. Second, different clients may have different computing power, so the slowest client becomes a bottleneck [,], and the other clients must wait until it completes its work.

The performance of PubMedBERT was better than that of BioBERT, ClinicalBERT, and RoBERTa. shows that the vocabulary of BERT models is an important factor in model performance. The vocabulary of PubMedBERT contains predominantly medical terms, whereas the vocabularies of the other three models contain common words. This difference affects the ability to recognize words in clinical text. Most published BERT models use a vocabulary of 30,522 WordPieces []. However, these vocabularies do not contain some words from specialized fields. For example, the medical term “lymphoma” is in the vocabulary of PubMedBERT but not in the vocabularies of BioBERT, ClinicalBERT, and RoBERTa. The term “lymphoma” is kept as the single token “lymphoma” by the PubMedBERT tokenizer but is split into 3 tokens—“l”, “##ymph”, and “##oma”—by BioBERT, ClinicalBERT, and RoBERTa.
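This vocabulary effect can be checked directly with the Hugging Face tokenizers. The checkpoint identifiers below are the commonly used hub names and are our assumption, and the exact subword splits may vary with tokenizer versions (RoBERTa uses byte-level BPE rather than WordPiece, so its split looks different again).

```python
from transformers import AutoTokenizer

# Assumed Hugging Face hub identifiers for the four models.
checkpoints = {
    "PubMedBERT": "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext",
    "BioBERT": "dmis-lab/biobert-base-cased-v1.1",
    "ClinicalBERT": "emilyalsentzer/Bio_ClinicalBERT",
    "RoBERTa": "roberta-base",
}

for name, ckpt in checkpoints.items():
    tok = AutoTokenizer.from_pretrained(ckpt)
    # PubMedBERT keeps "lymphoma" as a single token; general-domain vocabularies
    # split it into several subword tokens.
    print(name, tok.tokenize("lymphoma"))
```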

In most scenarios, nonalphanumeric characters are removed because they are considered useless to the models []. In contrast to models with attention mechanisms, early NLP models could not pay attention to punctuation. Additional characters would make the models unable to focus well on keywords. The removal of punctuation in English text and text in other languages, such as Arabic, has been performed for NLP []. Ek et al [] compared 2 data sets of daily conversation text—one retained punctuation, and the other did not. Their results showed better performance for the data set that retained punctuation.

For experiments P, C, and PC, all models performed better when additional characters were retained (Table 4). Experiment P demonstrated that PubMedBERT could use embedded punctuation. As punctuation marks are used to separate different sentences, removing them connects all sentences and thus makes it harder for a model to understand the text content. The improvement in our F1 score for retaining punctuation is similar to the results of previous work by Ek et al []. Our results demonstrate that retaining punctuation can improve the performance of text classification models for clinical text. Experiment C demonstrated that PubMedBERT could use embedded Chinese characters. Although PubMedBERT was pretrained with mostly English text, its vocabulary contains many Chinese characters. The tokens from Chinese characters may contribute to the ICD-10 classification task for clinical text because they provide information such as place names, trauma mechanisms, and local customs. The results of experiment PC indicate that the benefits of retaining punctuation and retaining Chinese characters are additive. In the translation experiment, the F1 scores did not considerably differ. This result indicates that the model can extract information from clinical text in either English or Chinese. The use of the attention mechanisms of BERT increased our model’s ability to pay attention to keywords. Punctuation and Chinese characters contribute helpful information to these models. Therefore, this preprocessing strategy—retaining more meaningful tokens—provides more information to ICD-10 classification models.

In our previous study, we introduced an attention mechanism to visualize the attention given to the input text for ICD-10 definitions []. In that approach, one model was trained to predict ICD-10 codes and another model was trained to extract attention data, which might result in inconsistencies between the predictions and the attention. In this study, we introduced the label attention architecture to visualize the attention given to the input text for ICD-10 codes []. This method better illustrates the attention given to the input words used to predict ICD codes because the attention is produced by the same model that makes the predictions.

The F1 score of the model decreased by 0.009 after the label attention mechanism was added. Although the F1 score decreased, we obtained explainable predictions. For ICD-10-CM codes E78.5 (“Hyperlipidemia, unspecified”) and I25.10 (“Atherosclerotic heart disease of native coronary artery without angina pectoris”), our model successfully paid strong attention to the related words “hyperlipidemia” and “coronary artery” (Figure 6). Our visualization method (ie, highlighting input words) allows users to understand how our model identified ICD-10-CM codes from text.

Limitations

Our study has several limitations. First, our data were acquired from 3 tertiary hospitals in Taiwan. The extrapolation of our results to hospitals in other areas should be studied in the future. Second, although our results suggest that model performance is better when punctuation and Chinese characters are retained, this effect may be restricted to specific note types. This finding should be further examined in the context of classifying other types of clinical text. Third, the translated text in our last experiment may not be as accurate as translations by a native speaker. However, it is difficult to manually translate large amounts of data. As such, we could only automatically translate the text by using Google Translate.

It should be noted that each discharge note has a primary diagnosis code and secondary diagnosis codes. Although the choice of primary code affects reimbursement, the model proposed in this study did not identify primary codes. To make our model capable of identifying a primary code, we proposed a sequence-to-sequence model in our previous work []. It reorders the predicted labels, which are otherwise concatenated alphabetically, into the coded diagnosis order. This structure can be added to the model proposed in this study. Predictions that distinguish primary and secondary diagnosis codes could further improve the usability of this system.

Conclusions

Federated learning was used to train the ICD-10 classification model on multicenter clinical text while protecting data privacy. The federated model’s performance on external data was better than that of models trained locally. We provided explainable predictions by highlighting input words via a label attention architecture. We also found that the PubMedBERT model can use the meanings of punctuation and non-English characters. This finding demonstrates that changing the preprocessing method for ICD-10 multilabel classification can improve model performance.

This study was supported by grants from the Ministry of Science and Technology, Taiwan (grants MOST 110-2320-B-075-004-MY and MOST 110-2634-F-002-032-); Far Eastern Memorial Hospital, Taiwan (grant FEMH-2022-C-058); and Taipei Veterans General Hospital (grants V111E-002 and V111E-005-2). The sponsors had no role in the study design, data collection and analysis, publication decision, or manuscript drafting.

None declared.

Edited by C Lovis; submitted 24.07.22; peer-reviewed by I Li, N Nuntachit; comments to author 15.08.22; revised version received 03.10.22; accepted 08.10.22; published 10.11.22

©Pei-Fu Chen, Tai-Liang He, Sheng-Che Lin, Yuan-Chia Chu, Chen-Tsung Kuo, Feipei Lai, Ssu-Ming Wang, Wan-Xuan Zhu, Kuan-Chih Chen, Lu-Cheng Kuo, Fang-Ming Hung, Yu-Cheng Lin, I-Chang Tsai, Chi-Hao Chiu, Shu-Chih Chang, Chi-Yu Yang. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 10.11.2022.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
