The drug loading capacity prediction and cytotoxicity analysis of metal–organic frameworks using stacking algorithms of machine learning

Metal–organic frameworks (MOFs, also known as porous coordination polymers) are an emerging class of porous materials constructed from metal-containing nodes (also known as secondary building units) and organic ligands (Zhou and Kitagawa, 2014). With notable advantages such as large surface area, high porosity, and customizable functionality (Guo et al., 2022), MOFs have attracted growing interest and found applications in many fields, including biomedicine (Hofmann-Amtenbrink et al., 2015, Horcajada et al., 2012, Wang et al., 2023, Zheng et al., 2021), food processing (Kaur et al., 2018, Ma et al., 2020), cosmetics (Dai et al., 2012, Gaudin et al., 2012), energy storage (Pomerantseva et al., 2019, Wang et al., 2020, Yang et al., 2021), and pollution detection (Wang et al., 2024, Yue et al., 2021). Notably, MOFs have recently emerged as promising drug carriers (Abedi et al., 2021, Li et al., 2022, Lin et al., 2020, Pooresmaeil and Namazi, 2022, Tajahmadi et al., 2023), where their drug loading capacity and biocompatibility play pivotal roles in biomedical applications. Consequently, research on the cytotoxicity of MOFs has also attracted significant attention. It has been shown that the drug loading capacity and cytotoxicity of MOFs are affected by their physicochemical properties, such as topology, composition, particle size, and zeta potential (Tamames-Tabar et al., 2014).

Over the past decade, the exponential growth of experimental data on MOFs has made it difficult to identify the best-performing candidates, which has hampered their application and commercialization (He et al., 2021, Jeong et al., 2024). To screen MOFs with superior performance, machine learning (ML) techniques have been employed as powerful tools. For example, ML has been used to predict hydrogen storage and methane uptake in MOFs (Ahmed and Siegel, 2021, Suyetin, 2021). Moreover, the time and economic costs of ML methods are several orders of magnitude lower than those of conventional laboratory approaches. In addition, ML is well suited to complex problems involving massive combinatorial spaces or nonlinear processes (Butler et al., 2018, Santana et al., 2020). In the field of biomedicine and drug delivery, ML has been successfully employed for the prediction of pharmaceutical formulations (Wang et al., 2021, Yang et al., 2019). In our previous work, we successfully used ML to predict the drug loading capacity of ibuprofen in MOFs (Liu et al., 2022). However, studies focusing on the biocompatibility of MOFs remain scarce. Only a few reviews have summarized that the biocompatibility of MOFs is related not only to their physical properties, including size, shape, and zeta potential, but also to their metal atoms and organic ligands (Ahmadi et al., 2021, Singh et al., 2021). To date, ML has rarely been applied to predict the cytotoxicity of specific MOFs.

In this study, we aimed to develop ML models that can predict the drug loading capacity and cytotoxicity of MOFs and then identify the key physicochemical properties of MOFs that significantly influence the drug loading capacity and/or cytotoxicity. To accomplish this, data on drug loading capacity (161 items) and cytotoxicity (444 items) were obtained from the literature after data screening. The dataset was constructed by incorporating the Morgan fingerprint of the drug molecule and the physicochemical structure properties of MOFs. Stacking algorithms, which have proven effective in ML classification and regression problems (Sill et al., 2009, Yu et al., 2022), were then employed. Moreover, the prediction results of multiple models were compared, including adaptive boosting regressor (AdaBoost), CatBoost regressor (CatBoost), random forest (RF), decision tree (DT), K-neighbors regressor (KNR), light gradient-boosting machine (LGBM), and histogram-based gradient boosting (HGB). Feature importance was analyzed using partial dependence plots (PDPs), individual conditional expectation (ICE), and Shapley additive explanation (SHAP). In addition, the crucial fingerprints of drug molecules were analyzed for their impact on drug loading capacity, and the performance of the stacking algorithms was validated by comparing experimental outcomes with the model predictions. Overall, the application of ML enables a faster and more efficient way of predicting properties and establishing component–property relationships of MOFs. This study also demonstrates the potential of ML in the field of biomaterials, contributing to the accelerated development of biomaterials within the realm of artificial intelligence.
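
To make the stacking idea concrete, the following is a minimal sketch in plain Python, not the pipeline used in this study. It assumes synthetic data in which each row concatenates a few toy fingerprint bits with two scaled MOF descriptors, and it uses two toy base learners (a k-nearest-neighbors average and a mean baseline) in place of the models listed above. All function and variable names are hypothetical. The point it illustrates is that the meta-learner is fitted on out-of-fold predictions of the base learners, so that it never sees predictions made on rows a base learner was trained on.

```python
import random
from statistics import mean

def knn_predict(train_X, train_y, x, k=3):
    """Toy base learner 1: average the targets of the k nearest training rows."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return mean(train_y[i] for i in order[:k])

def mean_predict(train_X, train_y, x):
    """Toy base learner 2: a deliberately weak baseline that ignores x."""
    return mean(train_y)

BASE_LEARNERS = [knn_predict, mean_predict]

def out_of_fold(X, y, n_folds=5):
    """Predict every row with base learners trained on the other folds only."""
    Z = [[0.0] * len(BASE_LEARNERS) for _ in range(len(X))]
    for fold in range(n_folds):
        train_idx = [i for i in range(len(X)) if i % n_folds != fold]
        held_out = [i for i in range(len(X)) if i % n_folds == fold]
        fold_X = [X[i] for i in train_idx]
        fold_y = [y[i] for i in train_idx]
        for i in held_out:
            for j, learner in enumerate(BASE_LEARNERS):
                Z[i][j] = learner(fold_X, fold_y, X[i])
    return Z

def fit_meta(Z, y, ridge=1e-6):
    """Meta-learner: lightly regularised least squares over two base predictions."""
    a11 = sum(z[0] * z[0] for z in Z) + ridge
    a12 = sum(z[0] * z[1] for z in Z)
    a22 = sum(z[1] * z[1] for z in Z) + ridge
    b1 = sum(z[0] * t for z, t in zip(Z, y))
    b2 = sum(z[1] * t for z, t in zip(Z, y))
    det = a11 * a22 - a12 * a12
    # Cramer's rule for the 2x2 normal equations.
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)

def stacked_predict(X, y, weights, x):
    """Combine base learners (refitted on all data) with the meta weights."""
    preds = [learner(X, y, x) for learner in BASE_LEARNERS]
    return sum(w * p for w, p in zip(weights, preds))

# Synthetic stand-in data (an assumption for illustration, not the study's dataset).
random.seed(0)

def toy_row():
    bits = [random.randint(0, 1) for _ in range(4)]    # stand-in for fingerprint bits
    descriptors = [random.random(), random.random()]   # stand-in for scaled MOF descriptors
    return bits + descriptors

X = [toy_row() for _ in range(60)]
y = [0.5 * r[0] + 0.3 * r[4] + 0.2 * r[5] + random.gauss(0, 0.02) for r in X]

Z = out_of_fold(X, y)          # level-1 training matrix: one column per base learner
weights = fit_meta(Z, y)       # meta-learner weights
prediction = stacked_predict(X, y, weights, X[0])
```

In practice one would swap the toy learners for the regressors named above and use a library implementation of stacking; the out-of-fold structure stays the same.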
