MASSA Algorithm: an automated rational sampling of training and test subsets for QSAR modeling

Yang X, Wang Y, Byrne R et al (2019) Concepts of artificial intelligence for computer-assisted drug discovery. Chem Rev 119:10520–10594. https://doi.org/10.1021/acs.chemrev.8b00728

Masand VH, Mahajan DT, Nazeruddin GM et al (2015) Effect of information leakage and method of splitting (rational and random) on external predictive ability and behavior of different statistical parameters of QSAR model. Med Chem Res 24:1241–1264. https://doi.org/10.1007/s00044-014-1193-8

Andrada MF, Vega-Hissi EG, Estrada MR, Garro Martinez JC (2017) Impact assessment of the rational selection of training and test sets on the predictive ability of QSAR models. SAR QSAR Environ Res 28:1011–1023. https://doi.org/10.1080/1062936X.2017.1397056

Clark DE (2006) What has computer-aided molecular design ever done for drug discovery? Expert Opin Drug Discov 1:103–110. https://doi.org/10.1517/17460441.1.2.103

International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use (2017) Assessment and Control of DNA Reactive (Mutagenic) Impurities in Pharmaceuticals to Limit Potential Carcinogenic Risk

Martin TM, Harten P, Young DM et al (2012) Does rational selection of training and test sets improve the outcome of QSAR modeling? J Chem Inf Model 52:2570–2578. https://doi.org/10.1021/ci300338w

Cherkasov A, Muratov EN, Fourches D et al (2014) QSAR modeling: where have you been? Where are you going to? J Med Chem 57:4977–5010. https://doi.org/10.1021/jm4004285

Muratov EN, Bajorath J, Sheridan RP et al (2020) QSAR without borders. Chem Soc Rev 49:3525–3564. https://doi.org/10.1039/D0CS00098A

Puzyn T, Mostrag-Szlichtyng A, Gajewicz A et al (2011) Investigating the influence of data splitting on the predictive ability of QSAR/QSPR models. Struct Chem 22:795–804. https://doi.org/10.1007/s11224-011-9757-4

Esbensen KH, Geladi P (2010) Principles of proper validation: use and abuse of re-sampling for validation. J Chemom 24:168–187. https://doi.org/10.1002/cem.1310

Hawkins DM, Basak SC, Mills D (2003) Assessing model fit by cross-validation. J Chem Inf Comput Sci 43:579–586. https://doi.org/10.1021/ci025626i

Golbraikh A, Tropsha A (2000) Predictive QSAR modeling based on diversity sampling of experimental datasets for the training and test set selection. Mol Divers 5:231–243. https://doi.org/10.1023/A:1021372108686

Golbraikh A, Shen M, Xiao Z et al (2003) Rational selection of training and test sets for the development of validated QSAR models. J Comput Aided Mol Des 17:241–253. https://doi.org/10.1023/A:1025386326946

Wu W, Walczak B, Massart DL et al (1996) Artificial neural networks in classification of NIR spectral data: design of the training set. Chemom Intell Lab Syst 33:35–46. https://doi.org/10.1016/0169-7439(95)00077-1

Kronenberger T, Windshügel B, Wrenger C et al (2018) On the relationship of anthranilic derivatives structure and the FXR (Farnesoid X receptor) agonist activity. J Biomol Struct Dyn 36:4378–4391. https://doi.org/10.1080/07391102.2017.1417161

Veríssimo GC, Menezes Dutra EF, Teotonio Dias AL et al (2019) HQSAR and random forest-based QSAR models for anti-T. vaginalis activities of nitroimidazoles derivatives. J Mol Graph Model 90:180–191. https://doi.org/10.1016/j.jmgm.2019.04.007

Gomes RA, Genesi GL, Maltarollo VG, Trossini GHG (2017) Quantitative structure–activity relationships (HQSAR, CoMFA, and CoMSIA) studies for COX-2 selective inhibitors. J Biomol Struct Dyn 35:1436–1445. https://doi.org/10.1080/07391102.2016.1185379

de Fernandes PO, Martins JPA, de Melo EB et al (2021) Quantitative structure-activity relationship and machine learning studies of 2-thiazolylhydrazone derivatives with anti-Cryptococcus neoformans activity. J Biomol Struct Dyn. https://doi.org/10.1080/07391102.2021.1935321

Kronenberger T, Asse LR, Wrenger C et al (2017) Studies of Staphylococcus aureus FabI inhibitors: fragment-based approach based on holographic structure–activity relationship analyses. Future Med Chem 9:135–151. https://doi.org/10.4155/fmc-2016-0179

Ferreira GM, de Magalhães JG, Maltarollo VG et al (2020) QSAR studies on the human sirtuin 2 inhibition by non-covalent 7,5,2-anilinobenzamide derivatives. J Biomol Struct Dyn 38:354–363. https://doi.org/10.1080/07391102.2019.1574603

Maltarollo VG (2019) Classification of Staphylococcus aureus FabI inhibitors by machine learning techniques. IJQSPR 4:1–14. https://doi.org/10.4018/IJQSPR.2019100101

Primi MC, Maltarollo VG, Magalhães JG et al (2016) Convergent QSAR studies on a series of NK3 receptor antagonists for schizophrenia treatment. J Enzyme Inhib Med Chem 31:283–294. https://doi.org/10.3109/14756366.2015.1021250

Popova M, Isayev O, Tropsha A (2018) Deep reinforcement learning for de novo drug design. Sci Adv 4:eaap7885. https://doi.org/10.1126/sciadv.aap7885

Schneider G (2019) Mind and machine in drug design. Nat Mach Intell 1:128–130. https://doi.org/10.1038/s42256-019-0030-7

Dara S, Dhamercherla S, Jadav SS et al (2022) Machine learning in drug discovery: a review. Artif Intell Rev 55:1947–1999. https://doi.org/10.1007/s10462-021-10058-4

Ambure P, Halder AK, González Díaz H, Cordeiro MNDS (2019) QSAR-Co: an open source software for developing robust multitasking or multitarget classification-based QSAR models. J Chem Inf Model 59:2538–2544. https://doi.org/10.1021/acs.jcim.9b00295

Halder AK, Dias Soeiro Cordeiro MN (2021) QSAR-Co-X: an open source toolkit for multitarget QSAR modelling. J Cheminform 13:29. https://doi.org/10.1186/s13321-021-00508-0

Veríssimo GC (2021) MASSA Algorithm: Molecular data set sampling for training-test separation

Landrum G (2021) RDKit: 2021_03_3 (Q1 2021) Release

Vos NJ de (2015) KModes categorical clustering library

Python Software Foundation argparse—Parser for command-line options, arguments and sub-commands—Python 3.9.7 documentation. https://docs.python.org/3/library/argparse.html. Accessed 5 Oct 2021

scikit-learn: machine learning in Python—scikit-learn 1.0 documentation. https://scikit-learn.org/stable/index.html. Accessed 5 Oct 2021

sklearn.decomposition.PCA. In: scikit-learn. https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html. Accessed 5 Oct 2021

scipy.cluster.hierarchy.linkage—SciPy v1.7.1 Manual. https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html. Accessed 8 Oct 2021

scipy.cluster.hierarchy.maxdists—SciPy v1.8.0 Manual. https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.maxdists.html. Accessed 22 Mar 2022

scipy.cluster.hierarchy.fcluster—SciPy v1.7.1 Manual. https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.fcluster.html. Accessed 8 Oct 2021

scipy.cluster.hierarchy.dendrogram—SciPy v1.7.1 Manual. https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.dendrogram.html. Accessed 8 Oct 2021

sklearn.model_selection.train_test_split. In: scikit-learn. https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html. Accessed 9 Oct 2021

Sutherland JJ, O’Brien LA, Weaver DF (2004) A comparison of methods for modeling quantitative structure–activity relationships. J Med Chem 47:5541–5554. https://doi.org/10.1021/jm0497141

Liu C-J, Yu S-L, Liu Y-P et al (2016) Synthesis, cytotoxic activity evaluation and HQSAR study of novel isosteviol derivatives as potential anticancer agents. Eur J Med Chem 115:26–40. https://doi.org/10.1016/j.ejmech.2016.03.009

Valadares NF, Castilho MS, Polikarpov I, Garratt RC (2007) 2D QSAR studies on thyroid hormone receptor ligands. Bioorg Med Chem 15:4609–4617. https://doi.org/10.1016/j.bmc.2007.04.015

Ye M, Dawson MI (2009) Studies of cannabinoid-1 receptor antagonists for the treatment of obesity: hologram QSAR model for biarylpyrazolyl oxadiazole ligands. Bioorg Med Chem Lett 19:3310–3315. https://doi.org/10.1016/j.bmcl.2009.04.072

Jiao L, Wang Y, Qu L et al (2020) Hologram QSAR study on the critical micelle concentration of Gemini surfactants. Colloids Surf A 586:124226. https://doi.org/10.1016/j.colsurfa.2019.124226

Dassault Systèmes Biovia Corp (2020) BIOVIA discovery studio visualizer 2021

Hawkins PCD, Skillman AG, Warren GL et al (2010) Conformer generation with OMEGA: algorithm and validation using high quality structures from the Protein Databank and Cambridge Structural Database. J Chem Inf Model 50:572–584
