DRTerHGAT: A drug repurposing method based on the ternary heterogeneous graph attention network

Growing knowledge of disease and pharmacology has not accelerated drug research and development. Instead, rising investment stands in sharp contrast to declining output [1,2]. Because drug research and development demands high investment and yields low returns, investors do not favour it [3]. Drug repurposing (repositioning) strategies were therefore proposed. Drug repurposing is the systematic investigation of approved drugs to find uses beyond the scope of the original medical indication [1]. Compared with traditional drug discovery, drug repurposing can shorten the development cycle by approximately five years [4]. In addition, repurposed drugs raise fewer patient safety concerns because their toxicity, metabolism, and possible side effects are already known [5].

In the past, successful drug repurposing cases were often accidental [6]. Rational drug repurposing methods have therefore been proposed continuously, and computational methods in particular have become efficient and promising [7,8,9]. Computational drug repurposing approaches predict interactions among drugs, targets, and diseases by integrating large-scale genomic and phenotypic data with chemical and biological activity data from hundreds of approved drugs [10].

Computational drug repurposing methods approach the problem from two perspectives: the chemical perspective of drugs and the pathological perspective of diseases [11]. The central hypothesis of repurposing is that drugs with similar structures or properties are often related to diseases with similar pathogenesis or symptoms, and vice versa. Based on this idea, drug–disease relationships have been predicted using multiple drug–drug and disease–disease similarity measures together with logistic regression [12,13]. A similarity-constrained matrix factorization method enabled drug repurposing by combining known drug–disease associations with the semantic information of drugs and diseases [14].
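The similarity hypothesis above can be illustrated with a minimal sketch. It is not the method of any cited work: the function name `association_score`, the toy similarity tables, and the choice of a geometric mean to combine the two similarities are assumptions made purely for illustration. A candidate drug–disease pair is scored by its closest known pair, and scores of this kind could then serve as input features for a classifier such as logistic regression.

```python
from math import sqrt


def association_score(drug, disease, known_pairs, drug_sim, disease_sim):
    """Score a candidate (drug, disease) pair by its most similar known pair.

    For each known association (d, s), the drug-drug similarity and the
    disease-disease similarity are combined by a geometric mean; the
    maximum over all known associations is returned.
    """
    best = 0.0
    for known_drug, known_disease in known_pairs:
        if (known_drug, known_disease) == (drug, disease):
            continue  # a known pair must not vote for itself
        combined = sqrt(
            drug_sim[drug][known_drug] * disease_sim[disease][known_disease]
        )
        best = max(best, combined)
    return best


# Hypothetical toy similarities (values in [0, 1]); not real data.
drug_sim = {
    "drugA": {"drugA": 1.0, "drugB": 0.81, "drugC": 0.04},
    "drugB": {"drugA": 0.81, "drugB": 1.0, "drugC": 0.09},
    "drugC": {"drugA": 0.04, "drugB": 0.09, "drugC": 1.0},
}
disease_sim = {
    "dis1": {"dis1": 1.0, "dis2": 0.25},
    "dis2": {"dis1": 0.25, "dis2": 1.0},
}
known_pairs = [("drugA", "dis1")]

# drugB resembles drugA far more than drugC does, so it ranks higher for dis1.
score_b = association_score("drugB", "dis1", known_pairs, drug_sim, disease_sim)
score_c = association_score("drugC", "dis1", known_pairs, drug_sim, disease_sim)
```

Taking the maximum rather than the mean keeps the score driven by the single most similar known association, which is one plausible design choice among several.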

In addition, methods based on the principles of network pharmacology have been continuously developed by constructing drug-related networks. The random walk with restart algorithm was used to explore potential drug–target relationships in drug–protein (target) networks combined with drug–drug and protein–protein similarity networks [15]. A drug–drug similarity network was built from drug side effects, and candidate drugs were screened according to their number of network neighbours [16]. In the drug–target network and the protein–protein interaction network, candidate therapeutic drugs can be rapidly screened based on their association with COVID-19-related proteins [17,18]. The progress made in these studies suggests that involving proteins as independent nodes in the network is beneficial for drug repurposing.
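As an illustration of the random walk with restart idea, the following is a minimal sketch on a small hypothetical network. It is a generic textbook formulation, not the implementation of the cited study; the function name, the restart probability of 0.3, and the four-node toy graph are assumptions. The walker starts from the seed nodes, and at each step either follows an edge or, with probability `restart`, jumps back to a seed. The stationary probabilities rank the other nodes by proximity to the seeds.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until convergence.

    adjacency: square list of lists with non-negative edge weights.
    seeds:     set of node indices where the walker restarts.
    W is the column-normalised adjacency matrix.
    """
    n = len(adjacency)
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # The restart distribution is uniform over the seed nodes.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1.0 - restart) * sum(transition[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        change = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if change < tol:
            break
    return p


# Hypothetical chain: node 0 is a drug, nodes 1 to 3 are proteins (0-1-2-3).
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(adjacency, seeds={0})
# Proteins closer to the seed drug receive higher scores.
ranking = sorted(range(1, 4), key=lambda node: scores[node], reverse=True)
```

A larger `restart` value keeps the walker closer to the seeds and therefore emphasises direct neighbours over distant nodes.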

The development of deep learning has made it possible to obtain deep features of drugs and diseases, and drug repurposing methods based on deep learning have been proposed continuously. DeepDR first calculates similarity features from multiple drug networks, then uses deep learning to extract deep drug features, and finally uses a conditional variational autoencoder to predict drug–disease relationships [19]. A convolutional neural network has been used to extract drug–disease association features from a two-dimensional matrix containing molecular drug mechanisms and disease clinical symptom information in order to predict new drug–disease relationships [20]. Since graph neural networks have been widely used in drug prediction, researchers have further proposed heterogeneous graph neural networks for drug repurposing [21,22,23]. LAGCN stacks three heterogeneous graph neural network layers to extract deep features from the drug–disease network for drug repurposing [24]. DRHGCN integrates homogeneous drug networks, homogeneous disease networks, and heterogeneous drug–disease networks to achieve efficient drug–disease prediction [25]. These heterogeneous networks include only drug and disease nodes. Recently, some studies have attempted to add protein nodes to the heterogeneous graph for deep learning. BiFusion uses a bipartite convolutional neural network to learn high-quality features of drugs and diseases from drug–protein and disease–protein networks [26]. RHGT extracts node features from a drug–protein–disease heterogeneous graph to predict drug–disease links [27]. However, when proteins are independent nodes in the network, extracting protein features effectively poses a new difficulty for heterogeneous graph deep learning methods.

As the above analysis shows, computational drug repurposing methods have made progress, but challenges remain. In particular, it is difficult to build a deep learning model that fully reflects the relationships among drugs, proteins, and diseases. This paper proposes a drug repurposing method based on the ternary heterogeneous graph attention network (DRTerHGAT). First, DRTerHGAT introduces a novel protein feature extraction process. In this process, a large-scale unsupervised protein language model is used to extract high-dimensional protein sequence features. Known protein interaction network information and a multitask learning autoencoder are then used to compress the high-dimensional protein sequence features into low-dimensional, effective protein features. Second, to establish links among drug, protein, and disease nodes, a heterogeneous graph is constructed. It includes calculated homogeneous similarity associations among drugs, among diseases, and among proteins, as well as known heterogeneous associations between drugs and diseases, drugs and targets (proteins), and diseases and genes (proteins). Finally, based on the graph and the extracted protein features, the deep features of the drugs and the diseases are extracted by graph convolutional networks (GCN) and heterogeneous graph node attention networks (HGNA).
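The graph construction described above can be sketched as a block adjacency matrix. The following is only an illustration under stated assumptions, not the DRTerHGAT implementation: the function names, the node ordering (drugs, then proteins, then diseases), the toy matrices, and the parameter-free symmetric normalisation used for the propagation step are all hypothetical choices. The three similarity matrices fill the diagonal blocks, and the three known association matrices (with their transposes) fill the off-diagonal blocks.

```python
from math import sqrt


def transpose(matrix):
    """Return the transpose of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


def build_ternary_adjacency(sim_drug, sim_prot, sim_dis, drug_prot, drug_dis, dis_prot):
    """Stack similarity and association blocks into one symmetric matrix.

    Node order: all drugs, then all proteins, then all diseases.
    """
    prot_drug = transpose(drug_prot)
    dis_drug = transpose(drug_dis)
    prot_dis = transpose(dis_prot)
    rows = []
    for i in range(len(sim_drug)):
        rows.append(sim_drug[i] + drug_prot[i] + drug_dis[i])
    for i in range(len(sim_prot)):
        rows.append(prot_drug[i] + sim_prot[i] + prot_dis[i])
    for i in range(len(sim_dis)):
        rows.append(dis_drug[i] + dis_prot[i] + sim_dis[i])
    return rows


def propagate(adjacency, features):
    """One weight-free graph convolution step: D^-1/2 A D^-1/2 X."""
    n = len(adjacency)
    inv_sqrt = [1.0 / sqrt(sum(row)) if sum(row) > 0 else 0.0 for row in adjacency]
    dim = len(features[0])
    return [
        [
            sum(
                inv_sqrt[i] * adjacency[i][j] * inv_sqrt[j] * features[j][k]
                for j in range(n)
            )
            for k in range(dim)
        ]
        for i in range(n)
    ]


# Hypothetical toy graph: 2 drugs, 2 proteins, 1 disease.
sim_drug = [[1.0, 0.5], [0.5, 1.0]]   # drug-drug similarity
sim_prot = [[1.0, 0.2], [0.2, 1.0]]   # protein-protein similarity
sim_dis = [[1.0]]                     # disease-disease similarity
drug_prot = [[1, 0], [0, 1]]          # drug-target associations
drug_dis = [[1], [0]]                 # drug-disease associations
dis_prot = [[1, 0]]                   # disease-gene associations

adjacency = build_ternary_adjacency(
    sim_drug, sim_prot, sim_dis, drug_prot, drug_dis, dis_prot
)
# One-hot node-type features: columns are (drug, protein, disease).
features = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1]]
hidden = propagate(adjacency, features)
```

After one propagation step, each node's representation mixes information from its drug, protein, and disease neighbours. A trainable model would additionally apply learned weight matrices, nonlinearities, and attention coefficients, none of which are shown here.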

The main contributions of this work are summarized as follows.

A protein feature extraction process was proposed to obtain high-quality protein features. A ternary heterogeneous graph including drugs, proteins, and diseases was constructed to comprehensively consider the relationships among them.

A drug repurposing model based on the ternary heterogeneous graph attention network (DRTerHGAT) was proposed to extract the deep features of drugs and diseases from the relationship among drugs, proteins, and diseases. The model was compared with other advanced models and showed excellent performance.

DRTerHGAT predictions, together with evidence from the literature, suggest that Tolcapone and Miglustat have the potential to treat Alzheimer's disease (AD).
