Enabling data-limited chemical bioactivity predictions through deep neural network transfer learning

Ching T, Himmelstein DS, Beaulieu-Jones BK, Kalinin AA, Do BT, Way GP, Ferrero E, Agapow PM, Zietz M, Hoffman MM, Xie W, Rosen GL, Lengerich BJ, Israeli J, Lanchantin J, Woloszynek S, Carpenter AE, Shrikumar A, Xu J, Cofer EM, Lavender CA, Turaga SC, Alexandari AM, Lu Z, Harris DJ, De Caprio D, Qi Y, Kundaje A, Peng Y, Wiley LK, Segler MHS, Boca SM, Swamidass SJ, Huang A, Gitter A, Greene CS (2018) Opportunities and obstacles for deep learning in biology and medicine. J R Soc Interface 15:20170387

Loiodice S, Nogueira da Costa A, Atienzar F (2019) Current trends in in silico, in vitro toxicology, and safety biomarkers in early drug development. Drug Chem Toxicol 42:113–121

Muster W, Breidenbach A, Fischer H, Kirchner S, Muller L, Pahler A (2008) Computational toxicology in drug development. Drug Discov Today 13:303–310

Valerio LG Jr (2009) In silico toxicology for the pharmaceutical sciences. Toxicol Appl Pharmacol 241:356–370

Keyvanpour MR, Shirzad MB (2021) An analysis of QSAR research based on machine learning concepts. Curr Drug Discov Technol 18:17–30

Piir G, Kahn I, Garcia-Sosa AT, Sild S, Ahte P, Maran U (2018) Best practices for QSAR model reporting: physical and chemical properties, ecotoxicity, environmental fate, human health, and toxicokinetics endpoints. Environ Health Perspect 126:126001. https://doi.org/10.1289/EHP3264

Tropsha A, Golbraikh A (2007) Predictive QSAR modeling workflow, model applicability domains, and virtual screening. Curr Pharm Des 13:3494–3504

Neves BJ, Braga RC, Melo-Filho CC, Moreira-Filho JT, Muratov EN, Andrade CH (2018) QSAR-based virtual screening: advances and applications in drug discovery. Front Pharmacol 9:1275. https://doi.org/10.3389/fphar.2018.01275

Mao J, Akhtar J, Zhang X, Sun L, Guan S, Li X, Chen G, Liu J, Jeon HN, Kim MS, No KT, Wang G (2021) Comprehensive strategies of machine-learning-based quantitative structure-activity relationship models. iScience 24:103052. https://doi.org/10.1016/j.isci.2021.103052

Tropsha A (2010) Best practices for QSAR model development, validation, and exploitation. Mol Inform 29:476–488

Shaikhina T, Khovanova NA (2017) Handling limited datasets with neural networks in medical applications: a small-data approach. Artif Intell Med 75:51–63

Sosnin S, Vashurina M, Withnall M, Karpov P, Fedorov M, Tetko IV (2019) A survey of multi-task learning methods in chemoinformatics. Mol Inform 38:e1800108. https://doi.org/10.1002/minf.201800108

Deng J, Dong W, Socher R, Li L, Li K, Li F (2009) ImageNet: A large-scale hierarchical image database. In: IEEE conference on computer vision and pattern recognition, pp 248–255. https://doi.org/10.1109/CVPR.2009.5206848

Emmert-Streib F, Yang Z, Feng H, Tripathi S, Dehmer M (2020) An introductory review of deep learning for prediction models with big data. Front Artif Intell 3:4. https://doi.org/10.3389/frai.2020.00004

LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521:436–444

Zhuang F, Qi Z, Duan K, Xi D, Zhu Y, Zhu H, Xiong H, He Q (2021) A comprehensive survey on transfer learning. Proc IEEE 109:43–76

Zhuang D, Ibrahim AK (2021) Deep learning for drug discovery: a study of identifying high efficacy drug compounds using a cascade transfer learning approach. Appl Sci 11:7772. https://doi.org/10.3390/app11177772

Li Y, Xu Y, Yu Y (2021) CRNNTL: convolutional recurrent neural network and transfer learning for QSAR modeling in organic drug and material discovery. Molecules 26:7257. https://doi.org/10.3390/molecules26237257

Yamada H, Liu C, Wu S, Koyama Y, Ju S, Shiomi J, Morikawa J, Yoshida R (2019) Predicting materials properties with little data using shotgun transfer learning. ACS Cent Sci 5:1717–1730

Cai C, Wang S, Xu Y, Zhang W, Tang K, Ouyang Q, Lai L, Pei J (2020) Transfer learning for drug discovery. J Med Chem 63:8683–8694

Hu S, Chen P, Gu P, Wang B (2020) A deep learning-based chemical system for QSAR prediction. IEEE J Biomed Health Inform 24:3020–3028

Fernandez-Torras A, Comajuncosa-Creus A, Duran-Frigola M, Aloy P (2022) Connecting chemistry and biology through molecular descriptors. Curr Opin Chem Biol 66:102090. https://doi.org/10.1016/j.cbpa.2021.09.001

Chuang KV, Gunsalus LM, Keiser MJ (2020) Learning molecular representations for medicinal chemistry. J Med Chem 63:8705–8722

Xue L, Bajorath J (2000) Molecular descriptors in chemoinformatics, computational combinatorial chemistry, and virtual screening. Comb Chem High Throughput Screen 3:363–372

Sahoo S, Adhikari C, Kuanar M, Mishra BK (2016) A short review of the generation of molecular descriptors and their applications in quantitative structure property/activity relationships. Curr Comput Aided Drug Des 12:181–205

Rogers D, Hahn M (2010) Extended-connectivity fingerprints. J Chem Inf Model 50:742–754

Broccatelli F, Trager R, Reutlinger M, Karypis G, Li M (2022) Benchmarking accuracy and generalizability of four graph neural networks using large in vitro ADME datasets from different chemical spaces. Mol Inform. https://doi.org/10.1002/minf.202100321

Carracedo-Reboredo P, Linares-Blanco J, Rodriguez-Fernandez N, Cedron F, Novoa FJ, Carballal A, Maojo V, Pazos A, Fernandez-Lozano C (2021) A review on machine learning approaches and trends in drug discovery. Comput Struct Biotechnol J 19:4538–4558

Deng D, Chen X, Zhang R, Lei Z, Wang X, Zhou F (2021) XGraphBoost: extracting graph neural network-based features for a better prediction of molecular properties. J Chem Inf Model 61:2697–2705

Jiang D, Wu Z, Hsieh CY, Chen G, Liao B, Wang Z, Shen C, Cao D, Wu J, Hou T (2021) Could graph neural networks learn better molecular representation for drug discovery? A comparison study of descriptor-based and graph-based models. J Cheminform 13:12. https://doi.org/10.1186/s13321-020-00479-8

Yang K, Swanson K, Jin W, Coley C, Eiden P, Gao H, Guzman-Perez A, Hopper T, Kelley B, Mathea M, Palmer A, Settels V, Jaakkola T, Jensen K, Barzilay R (2019) Analyzing learned molecular representations for property prediction. J Chem Inf Model 59:3370–3388

Wieder O, Kohlbacher S, Kuenemann M, Garon A, Ducrot P, Seidel T, Langer T (2020) A compact review of molecular property prediction with graph neural networks. Drug Discov Today Technol 37:1–12

Sun M, Zhao S, Gilvary C, Elemento O, Zhou J, Wang F (2020) Graph convolutional networks for computational drug development and discovery. Brief Bioinform 21:919–935

Shoemaker RH (2006) The NCI60 human tumour cell line anticancer drug screen. Nat Rev Cancer 6:813–823

Close DA, Wang AX, Kochanek SJ, Shun T, Eiseman JL, Johnston PA (2019) Implementation of the NCI-60 human tumor cell line panel to screen 2260 cancer drug combinations to generate >3 million data points used to populate a large matrix of anti-neoplastic agent combinations (ALMANAC) database. SLAS Discov 24:242–263

Liu T, Lin Y, Wen X, Jorissen RN, Gilson MK (2007) BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Res 35:D198–D201

Wang Y, Bryant SH, Cheng T, Wang J, Gindulyte A, Shoemaker BA, Thiessen PA, He S, Zhang J (2017) PubChem BioAssay: 2017 update. Nucleic Acids Res 45:D955–D963

Gadaleta D, Vukovic K, Toma C, Lavado GJ, Karmaus AL, Mansouri K, Kleinstreuer NC, Benfenati E, Roncaglioni A (2019) SAR and QSAR modeling of a large collection of LD50 rat acute oral toxicity data. J Cheminform 11:58. https://doi.org/10.1186/s13321-019-0383-2

Sorkun MC, Khetan A, Er S (2019) AqSolDB, a curated reference set of aqueous solubility and 2D descriptors for a diverse set of compounds. Sci Data 6:143. https://doi.org/10.7910/DVN/OVHAW8

Bergstra J, Bardenet R, Bengio Y, Kégl B (2011) Algorithms for hyper-parameter optimization. In: Advances in neural information processing systems, pp 2546–2554

Ma J, Sheridan RP, Liaw A, Dahl GE, Svetnik V (2015) Deep neural nets as a method for quantitative structure-activity relationships. J Chem Inf Model 55:263–274

Ramsundar B, Liu B, Wu Z, Verras A, Tudor M, Sheridan RP, Pande V (2017) Is multitask deep learning practical for pharma? J Chem Inf Model 57:2068–2076
