The 2024 Nobel Prizes in Chemistry and Physics highlighted the importance of artificial intelligence and machine learning (ML) techniques in research for the advancement of civilization. While these prizes were awarded for foundational discoveries related to ML and neural networks, and for computational protein design and structure prediction, machine learning is of interest across many fields of research.
In their work, the team collated a dataset of 902 polymers from published research, each with an experimentally determined glass transition temperature (Tg). To ensure the model's applicability to a wide range of polymers, the dataset included polymers with repeating units ranging from 2 to more than 80 atoms (see left image). Because there are various linear regression ML models to choose from for such investigations, the team decided to test six of them. Before the data were fed to the models, they were curated (for example, entries involving salts and molecular mixtures were removed) and split into training and test sets, so that each model's account of the structure–property relationships in the dataset could be validated.
An important aspect of ML models is that they need to know what to look for in the data. A set of descriptors was therefore generated to represent the molecular and electronic structures, as well as topological properties and functional groups. Given the glass transition temperatures of the molecules, the ML models are then allowed to seek their own correlations between the descriptors and Tg.
With each model deciphering the data and making connections to Tg, its success was quantified by the root mean square error (RMSE) and the coefficient of determination (R², ranging from 0 to 1). Of the models tested, the support vector machine (SVM) model stood out with the best R² values: 0.77 for the test set and 0.81 for the training set (compared with values as low as 0.60 and 0.65, respectively, for the other models). The SVM model also had the lowest RMSE of all the tested models. To show what the R² and RMSE values mean in a physical sense, the team compared the ML-predicted Tg values with those obtained from experiments (see right image), which showed a remarkably close correlation.
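For readers who want to see how the two quality metrics mentioned above are computed, here is a minimal sketch in plain Python. It is not the team's code: the function names `rmse` and `r_squared` and the four Tg values are hypothetical, chosen only to illustrate the standard textbook definitions of RMSE and R².

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the inputs (e.g. kelvin)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total variation
    return 1.0 - ss_res / ss_tot

# Hypothetical values for illustration only; NOT data from the study.
tg_experimental = [300.0, 350.0, 400.0, 450.0]
tg_predicted = [310.0, 340.0, 410.0, 440.0]

print(f"RMSE = {rmse(tg_experimental, tg_predicted):.1f} K")   # RMSE = 10.0 K
print(f"R^2  = {r_squared(tg_experimental, tg_predicted):.3f}")  # R^2  = 0.968
```

In this toy case every prediction is off by 10 K, so the RMSE is 10 K, while R² close to 1 indicates that the predictions track most of the spread in the experimental values. With this definition, R² can drop below 0 when predictions are worse than simply guessing the mean, which is most likely to show up on a held-out test set.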