Extracting drug-drug interactions from no-blinding texts using key semantic sentences and GHM loss

Elsevier

Available online 3 September 2022

Journal of Biomedical Informatics

Highlights

We propose a novel method for extracting drug-drug interactions without using external drug information.

No-blinding sentences and drug entity marking enhance drug entity pairs, key semantic sentences emphasize DDI relations, and the Gradient Harmonizing Mechanism loss alleviates label noise.

The proposed method performs comparably to the state of the art but is considerably faster, consumes less memory, and uses far fewer parameters.
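The difference between drug blinding and no-blinding input with entity marking can be illustrated with a minimal sketch. The marker strings (`<e1>`, `<e2>`), the DRUG1/DRUG2/DRUG0 placeholder convention, the helper names, and the example sentence are all assumptions made for illustration; the source only mentions placeholder tokens "such as DRUG0" and does not specify the marking scheme.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def find_spans(sentence: str, names: List[str]) -> List[Span]:
    """Locate each drug name (assumed unique in the sentence), sorted by position."""
    spans = [(sentence.index(n), sentence.index(n) + len(n)) for n in names]
    return sorted(spans)


def blind(sentence: str, drugs: List[Span], pair: Tuple[int, int]) -> str:
    """Drug blinding: replace every drug mention with a placeholder token.

    Assumed convention: the selected pair becomes DRUG1/DRUG2 and every
    other mention becomes DRUG0.
    """
    out, cursor = [], 0
    for i, (start, end) in enumerate(drugs):
        out.append(sentence[cursor:start])
        if i == pair[0]:
            out.append("DRUG1")
        elif i == pair[1]:
            out.append("DRUG2")
        else:
            out.append("DRUG0")
        cursor = end
    out.append(sentence[cursor:])
    return "".join(out)


def mark(sentence: str, drugs: List[Span], pair: Tuple[int, int]) -> str:
    """No blinding: keep full drug names and wrap only the selected pair
    in (hypothetical) entity markers."""
    out, cursor = [], 0
    for i, (start, end) in enumerate(drugs):
        out.append(sentence[cursor:start])
        name = sentence[start:end]
        if i == pair[0]:
            out.append(f"<e1> {name} </e1>")
        elif i == pair[1]:
            out.append(f"<e2> {name} </e2>")
        else:
            out.append(name)  # other drugs stay as plain text
        cursor = end
    out.append(sentence[cursor:])
    return "".join(out)


sentence = "Aspirin may increase the effect of warfarin and heparin."
spans = find_spans(sentence, ["Aspirin", "warfarin", "heparin"])
print(blind(sentence, spans, (0, 2)))
print(mark(sentence, spans, (0, 2)))
```

With blinding, instances built from the same sentence differ only in which placeholder sits where; with marking, the drug names themselves remain visible to the encoder and only the selected pair is wrapped.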

Abstract

The extraction of drug-drug interactions (DDIs) is an important task in biomedical research that can reduce unexpected health risks during patient treatment. Previous work indicates that methods using external drug information perform much better than methods that do not. However, using external drug information is time-consuming and resource-intensive. In this work, we propose a novel method for extracting DDIs that does not use external drug information but still achieves comparable performance. First, we no longer convert drug names to standard tokens such as DRUG0, as is common in previous research. Instead, full drug names with drug entity marking are input to BioBERT, allowing us to enhance the selected drug entity pair. Second, we adopt the Key Semantic Sentence approach to emphasize the words closely related to the DDI relation of the selected drug pair. Together, these two steps significantly reduce the misclassification of similar instances, that is, instances created from the same sentence but corresponding to different pairs of drug entities. Then, we employ the Gradient Harmonizing Mechanism (GHM) loss to reduce the weight of mislabeled instances and easy-to-classify instances, both of which can lead to poor performance in DDI extraction. Overall, we demonstrate that it is better not to use drug blinding with BioBERT, and show that GHM performs better than Cross-Entropy loss if the proportion of label noise is less than 30%. The proposed model achieves state-of-the-art results with an F1-score of 84.13% on the DDIExtraction 2013 corpus (a standard English DDI corpus), closing the 4% performance gap between methods that rely on external drug information and those that do not.
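The reweighting idea behind the GHM loss can be sketched as follows. This is a simplified, pure-Python version of the classification variant (GHM-C) from the original GHM formulation, not the authors' implementation: the gradient norm is taken as g = 1 − p(true class), the number of bins is an arbitrary choice, and the function names are hypothetical. Each example's cross-entropy is divided by the gradient density of its bin, so densely populated regions (very easy examples with g near 0, and clusters of very hard, possibly mislabeled examples with g near 1) receive smaller weights.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one example's logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ghm_c_loss(
    batch_logits: Sequence[Sequence[float]],
    labels: Sequence[int],
    bins: int = 10,  # assumed value; a tunable hyperparameter
) -> Tuple[float, List[float]]:
    """Return (loss, per-example weights) for a GHM-C style loss."""
    n = len(labels)
    p_true = [softmax(l)[y] for l, y in zip(batch_logits, labels)]
    ce = [-math.log(max(p, 1e-12)) for p in p_true]

    # Gradient norm proxy (assumption): g = 1 - p_true, in [0, 1].
    g = [1.0 - p for p in p_true]

    # Histogram of gradient norms over equal-width bins.
    idx = [min(int(gi * bins), bins - 1) for gi in g]
    counts = [0] * bins
    for b in idx:
        counts[b] += 1

    # Gradient density GD = count / bin_width = count * bins;
    # harmonizing weight beta_i = N / GD(g_i).
    weights = [n / (counts[b] * bins) for b in idx]

    loss = sum(w * c for w, c in zip(weights, ce)) / n
    return loss, weights


# Toy batch: 8 easy examples, 1 moderately hard, 2 very hard (e.g. mislabeled).
logits = [[5.0, 0.0, 0.0]] * 8 + [[0.0, 0.0, 0.0]] + [[5.0, 0.0, 0.0]] * 2
labels = [0] * 8 + [0] + [1] * 2
loss, weights = ghm_c_loss(logits, labels)
print(loss, weights[0], weights[8], weights[9])
```

In the toy batch, the lone moderately hard example gets the largest weight, while both the crowd of easy examples and the pair of very hard ones are down-weighted relative to it.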

Keywords

Drug-drug interactions

Drug blinding

Data imbalance

Label-noise

© 2022 The Author(s). Published by Elsevier Inc.
