An exploratory multivariate analysis methodology designed for chromatography evaluation was applied to a dataset containing selected descriptors of the separation method. The analysis of such a multidimensional structure usually faces challenges from high dimensionality, variable interactions, redundancy, and multicollinearity, which obscure the data structure and complicate the isolation of individual variable effects. A comprehensive visual evaluation of such data is laborious, introduces subjectivity, and fails to reveal hidden dependencies, which hinders the rapid identification of critical parameters affecting chromatography.
To process such data, it was first necessary to select an appropriate tool from the R language portfolio, which comprises, e.g., correspondence analysis [36], hierarchical clustering (HC) [23], PCA [21], multiple correspondence analysis (MCA) [21], and FAMD [21]. FAMD was chosen as the most suitable method because it combines elements of PCA and MCA. It offers an advantage over both methods by directly handling mixed data types, enabling the integration of numeric variables and categorical factors into a single dataset. Like PCA and MCA, FAMD is a dimensionality reduction technique that identifies new orthogonal axes (principal dimensions) along which the data exhibit the greatest variance. The result is a transformation of the data into a new coordinate system, where the new coordinates are linear combinations of the coordinates representing the original variables.
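How FAMD places numeric and categorical variables on a common footing can be illustrated with a minimal Python sketch (not the R workflow used in this study). It assumes the standard FAMD coding, in which numeric variables are standardized and each category indicator is centered and divided by the square root of the category proportion; a PCA of the resulting table would then yield the principal dimensions. The function names and the toy pH and solvent values are hypothetical.

```python
from math import sqrt

def zscore(values):
    """Standardize a numeric variable (population standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

def encode_factor(levels):
    """Code a categorical variable as centered indicators scaled by 1/sqrt(p)."""
    n = len(levels)
    columns = {}
    for level in sorted(set(levels)):
        p = levels.count(level) / n  # proportion of individuals in this category
        columns[level] = [((1.0 if x == level else 0.0) - p) / sqrt(p)
                          for x in levels]
    return columns

# Hypothetical toy data: six individuals, one numeric and one categorical variable
ph = [3.0, 3.0, 6.8, 6.8, 9.5, 9.5]
solvent = ["MeOH", "ACN", "MeOH", "ACN", "MeOH", "ACN"]

ph_z = zscore(ph)
solvent_cols = encode_factor(solvent)

# Total variance carried by the factor: number of categories minus one
factor_variance = sum(sum(v * v for v in col) / len(col)
                      for col in solvent_cols.values())
print(round(factor_variance, 6))
```

With this coding, a standardized numeric variable carries unit variance, while a factor with K categories carries K − 1, which is what allows both kinds of variables to enter a single analysis.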
The application of FAMD in LC method development provides valuable insights into the analyzed dataset, as interpreting FAMD outputs can help identify the relative importance of the studied parameters and support targeted optimization of the method under investigation.
1. Correlation circle analysis. Inspecting the correlation circle can reveal which numeric variables are correlated and the nature of their correlations. Vectors pointing in the same direction indicate a strong positive correlation, whereas vectors pointing in opposite directions suggest an inverse relationship. Orthogonal vectors indicate uncorrelated variables. The length of a vector in the correlation circle reflects how strongly the corresponding variable contributes to the displayed principal dimensions. This output can, for example, reveal the correlation between the retention times of analyzed compounds and pH.
2. Plot of variable contributions. The variable contributions plot illustrates how dependent and independent variables contribute to the variability of the original dataset. Variables contributing more to the major dimensions have a greater impact, while those with minimal influence can be considered less significant. For instance, focusing on independent variables such as the additive in the mobile phase, organic solvent, pH, or stationary phase can identify the parameters that contribute most to the variability of dependent variables, such as retention time, resolution, or peak skewness, and thus the key parameters affecting chromatographic separation. Conversely, in the case of dependent variables (e.g., retention time or peak skewness), a high contribution of a specific compound to dataset variability indicates that this compound is strongly influenced by changes in independent parameters. The plot therefore shows where changes in independent parameters can alter the dependent variables and where no such relationship exists. The most important outcome of analyzing this plot is identifying which independent variables have a significant impact on the observed dependent chromatographic parameters. It also helps determine which compounds are most susceptible to these changes.
3. Factorial map. The inspection of the factorial map reveals the spatial distribution of categorical variables within the principal dimensions. The proximity of factors on the factorial map indicates a similar effect on the variability of the dataset, whereas factors located on opposite sides suggest an opposing influence. The position of factors in the new coordinate system provides insight into their contributions to the particular principal component, and color coding further highlights their contributions to dataset variability. By analyzing this plot, it can be determined which combinations of independent chromatographic parameters will have similar or entirely different effects on the chromatographic separation.
4. Plot of individuals. The plot of individuals illustrates the clustering of individual observations, highlighting similarities and differences among them. This facilitates the identification of individuals that lead to similar chromatographic outcomes. Additionally, the use of color coding (habillage) enables the examination of the influence of categorical variables such as stationary phase, additive in the mobile phase, or organic solvent on the clustering. In LC method development, this output can assist in selecting alternatives, such as a different stationary phase, or in identifying options for achieving an opposing outcome when modifications to the LC method are required. Color coding (habillage) according to the examined factors can also support conclusions drawn from other outputs of the FAMD analysis. When the factorial map and the plot of individuals are combined, the resulting plot directly shows how clusters in the plot of individuals relate to a particular factor in the factorial map.
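The reading of the variable contributions plot described in item 2 can be sketched in Python as follows. The sketch assumes that each variable's contribution to a dimension is its squared link with that dimension (squared correlation for numeric variables, squared correlation ratio for categorical ones) expressed as a percentage of the total, and that the reference line is the contribution expected if all variables contributed equally. The variable names follow the abbreviations used in the figures, but the numeric values are hypothetical, not results of this study.

```python
def contributions(squared_links):
    """Convert squared links with one dimension into percentage contributions."""
    total = sum(squared_links.values())
    return {name: 100.0 * value / total for name, value in squared_links.items()}

# Hypothetical squared links of five variables with Dim 1 (illustrative only)
links_dim1 = {
    "RT_Queti": 0.81,
    "RT_Zolpi": 0.64,
    "pH": 0.36,
    "ADDITIVE": 0.16,
    "COLUMN": 0.03,
}

contrib = contributions(links_dim1)

# Expected mean contribution if every variable contributed equally
threshold = 100.0 / len(links_dim1)

# Variables exceeding the reference line are read as the main drivers
above = sorted(name for name, c in contrib.items() if c > threshold)
print(above)
```

Variables above the reference line would be interpreted as the main drivers of that dimension; those far below it, such as the hypothetical COLUMN entry here, as minor.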
In addition to FAMD, HC was used to organize individuals into groups based on similarity, using all principal dimensions generated by FAMD and producing a dendrogram that visually represents the clustering structure of the data. To evaluate the effectiveness of the proposed method, five UHPLC columns and 16 mobile phase compositions were used for the separation of 15 compounds, resulting in a dataset containing 80 unique combinations of independent chromatographic parameters (individuals), each assigned a unique ID (Suppl. Information, Table S1). When applying FAMD to chromatography data evaluation, it was essential to define the specific objectives of the statistical analysis. Here, the primary objective was to identify key parameters influencing chromatographic separation of the evaluated compounds.
For clarity, the interpretation of FAMD output is demonstrated through a case study, examining the effects of independent chromatographic parameters on retention, resolution (reflected by PkC**), and peak distortion (characterized by peak skewness), and identifying compounds most vulnerable to changes in these parameters. Nevertheless, the presented statistical methods are universal and can be used for various other parameters and their combinations.
Chromatographic retention evaluation
The scree plot (Fig. 1a) shows that the first two principal dimensions account for approximately 60% of the explained data variability, with Dim 1 contributing 47.5% and Dim 2 contributing 10.9%. Because the explained variability changes considerably more slowly across the higher principal dimensions, only Dim 1 and Dim 2 were selected for further interpretation.
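The cumulative share quoted from the scree plot is a running sum of the per-dimension percentages. A short Python sketch using only the two values reported above (47.5% and 10.9%) gives 58.4% for Dim 1 and Dim 2 together; percentages for higher dimensions are not reported here and are therefore not included.

```python
from itertools import accumulate

# Explained variance (%) reported for the retention dataset: Dim 1, Dim 2
explained = [47.5, 10.9]

# Running total of explained variance across the retained dimensions
cumulative = list(accumulate(explained))

for dim, (pct, cum) in enumerate(zip(explained, cumulative), start=1):
    print(f"Dim {dim}: {pct:.1f}% (cumulative {cum:.1f}%)")
```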
Fig. 1 Retention evaluation. a Scree plot, b correlation circle, c variables’ contribution to principal dimension 1 (Dim 1) and d principal dimension 2 (Dim 2). The red dashed line in c and d indicates the expected mean contribution. Abbreviations: mobile phase additive (ADDITIVE), organic solvent (ORGANICS), stationary phase (COLUMN), retention time (RT), quetiapine (Queti), zolpidem (Zolpi), mirtazapine (Mirta), flupentixol (Flupe), trazodone (Trazo), sertraline (Sertra), alprazolam (Alpra), citalopram (Citalo), olanzapine (Olanza), amisulpride (Amisu), fentanyl (Fenta), clonazepam (Clona), methamphetamine (Metha), haloperidol (Halo), diazepam (Dia)
The correlation circle (Fig. 1b) illustrates that vectors align primarily with the positive side of Dim 1, except for those representing clonazepam, alprazolam, diazepam, and pH, demonstrating their high correlation with both principal dimensions. A statistically significant strong positive correlation in retention times, confirmed with Spearman’s test (p < 0.05) (Suppl. Information, Table S3), was observed for clonazepam-diazepam-alprazolam (r > 0.937), trazodone-haloperidol-olanzapine (r > 0.930), and mirtazapine-fentanyl (r > 0.974), indicating closely related retention behaviors of these groups of compounds that respond similarly to variations in mobile phase composition and stationary phase.
Furthermore, retention times correlate positively with pH to varying degrees, with mirtazapine (r = 0.790) and flupentixol (r = 0.413) exhibiting the most and least pH sensitivity, respectively. The orthogonal orientation of the pH vector relative to the vectors of clonazepam, diazepam, and alprazolam (r < 0.063) points to a statistically insignificant effect (p > 0.05) of pH on the retention of these benzodiazepines. The insensitivity of these compounds to pH changes, together with their high contribution to the explained variability, implies that variables other than pH significantly impact their retention. These findings suggest that pH adjustment is an effective strategy for fine-tuning the separation, except for clonazepam, diazepam, and alprazolam. The orthogonality of the PkC** vector with respect to pH (r = 0.204, p > 0.05) implies a minimal impact of pH on overall resolution. Thus, increasing pH generally increases retention but has little effect on overall resolution as measured by PkC**.
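The rank correlation underlying these r values can be sketched in plain Python. The sketch assumes Spearman's coefficient computed as the Pearson correlation of average ranks (so ties are handled), and it omits the significance test that accompanies the reported p values. The pH and retention values are hypothetical and serve only to show the calculation.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2.0 + 1.0  # mean of the positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical mobile phase pH values and retention times (min) of one compound
ph = [3.0, 4.5, 6.8, 8.0, 9.5]
rt = [1.2, 1.9, 2.4, 3.8, 3.1]

rho = spearman(ph, rt)
print(round(rho, 3))
```

A coefficient near zero, as reported for the benzodiazepines, corresponds to the near-orthogonal vectors seen in the correlation circle.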
The bar plots of variables’ contribution (Fig. 1c, d) illustrate that the variability along Dim 1 mainly originates from variations in the retention times of compounds, while Dim 2 is significantly influenced by mobile phase additive, pH, and organic solvent, identifying them as key factors affecting retention characteristics. In contrast, the type of stationary phase shows minimal impact in both principal dimensions, indicating limited effects on retention. The plots further show that, except for methamphetamine, all compounds are sensitive to changes in chromatographic conditions, according to their high contributions to Dim 1 and Dim 2.
The factorial map (Fig. 2a) shows significant influence of organic solvents and alkaline mobile phases on retention, based on their contribution to the explained variance. The positioning of methanol and acetonitrile at the opposing ends of Dim 1 and Dim 2 reflects their distinct impacts. Similarly, phases containing hydrogen carbonate, located in the lower-right quadrant, display opposite retention characteristics compared to other additives.
Fig. 2 Retention evaluation. a Factorial map and b plot of individuals, projected onto the first two principal dimensions (Dim 1, Dim 2). The numerical annotations correspond to the individuals’ ID (Suppl. Information, Table S1). The circles highlight combinations giving the highest (black) and lowest (red) mean retention, based on the first and last deciles for the mean retention of compounds
The positioning of Acquity BEH C18 and Luna OMEGA Polar C18 columns suggests that while they behave similarly, they influence retention differently from the Triart C18, Triart C18 ExRS, and Triart Phenyl.
Overall, the factorial map confirms that retention is more significantly influenced by variations in organic solvents and specific additives than by the stationary phases according to their minimal contribution to the dataset variance.
The plot of individuals (Fig. 2b) reveals the primary formation of two distinct clusters, corresponding to the type of organic solvent used (Suppl. Information, Fig. S1a). According to the color coding, a secondary trend correlating with the type of mobile phase additive emerges. Combinations using alkaline additives cluster in the lower right, those with ammonium formate and acetate are centrally located, and combinations with acidic additives group in the upper left. The lack of a distinct pattern in the color coding by stationary phase highlights the minimal influence of the stationary phase on retention characteristics (Suppl. Information, Fig. S1b). This supports the findings from other FAMD outputs. A retrospective analysis of the original dataset showed that the highest mean retention (1st decile) is related to individuals with IDs 72, 71, 75, 79, 62, 61, 69, and 65, while the lowest (10th decile) was observed for IDs 14, 4, 13, 20, 10, 44, 3, and 17. Mapping these individuals onto the plot of individuals reveals that combinations of chromatographic parameters leading to similar outcomes are positioned close together in the corresponding regions of the space defined by Dim 1 and Dim 2. Integrating insights from the factorial map, the plot of individuals, and the original dataset shows that mobile phases containing hydrogen carbonate and methanol correlate with higher mean retention times. Conversely, the use of acetonitrile combined with additives such as formic acid, acetic acid, ammonium formate buffer, or acetate buffer is linked to the opposite outcome. Additionally, the impact of the stationary phase on retention is less pronounced than that of the other factors.
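The decile-based selection used in this retrospective analysis can be sketched in Python: individuals are ranked by mean retention, and the top and bottom tenths are extracted, which for 80 individuals gives eight IDs each. The function name is hypothetical, and the retention values are generated artificially for illustration; they are not the study data.

```python
def decile_extremes(mean_retention):
    """Return (highest, lowest) ID lists, each holding 10% of the individuals."""
    ranked = sorted(mean_retention, key=mean_retention.get, reverse=True)
    size = len(ranked) // 10
    return ranked[:size], ranked[-size:]

# Hypothetical mean retention times (min) for 80 individuals; not the study data
mean_rt = {i: 1.0 + ((i * 37) % 80) / 10 for i in range(1, 81)}

highest, lowest = decile_extremes(mean_rt)
print("highest mean retention:", highest)
print("lowest mean retention:", lowest)
```

The resulting ID lists are what would be circled on the plot of individuals to check whether similar outcomes occupy neighboring regions.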
Peak skewness evaluation
The scree plot (Fig. 3a) shows that the first two principal dimensions explain approximately 33% of the data variability, with Dim 1 accounting for 19.2% and Dim 2 for 13.7%. Following the “elbow criterion,” the third principal dimension, accounting for an additional 8.4% of variance, was included for evaluation in the plot of individuals. The lower variability explained in the first two principal dimensions may be attributed to the less pronounced effect of independent parameters on peak skewness, leading to higher noise in the data and distribution across the higher principal dimensions.
Fig. 3 FAMD outputs in peak skewness (SKEW) evaluation. a Scree plot. b Correlation circle. c Variables’ contribution to principal dimension 1 (Dim 1) and d principal dimension 2 (Dim 2). The red dashed line in c and d indicates the expected mean contribution. For abbreviations, refer to Fig. 1
The correlation circle (Fig. 3b) indicates that the variability along Dim 1 is primarily influenced by the peak skewness of quetiapine, zolpidem, mirtazapine, and pH, while Dim 2 is mostly affected by fentanyl and haloperidol. In contrast, diazepam, clonazepam, olanzapine, and methamphetamine contribute the least to both principal dimensions, suggesting that their peak shapes are less impacted by changes in chromatographic parameters. Spearman’s correlation analysis (p < 0.05, Suppl. Information, Table S4) confirmed a strong positive correlation between pH and peak skewness of citalopram (r = 0.640) and a strong negative correlation with quetiapine (r = −0.750), indicating a significant dependence of their peak skewness on pH. Methamphetamine (r = 0.326), haloperidol (r = 0.427), and amisulpride (r = 0.575) exhibited weak to moderate positive correlations, while flupentixol (r = −0.224), alprazolam (r = −0.505), mirtazapine (r = −0.545), and zolpidem (r = −0.555) showed weak to moderate negative correlations. Very weak and statistically non-significant (p > 0.05) correlations were observed for trazodone, clonazepam, olanzapine, sertraline, diazepam, and fentanyl (r = −0.165 to 0.153). In summary, changes in pH significantly affected the peak skewness of citalopram, which increased with higher pH, and quetiapine, which showed the opposite effect. Other compounds, including trazodone, clonazepam, olanzapine, sertraline, diazepam, and fentanyl, were less responsive to pH fluctuations. Interestingly, the peak skewness of fentanyl, despite contributing significantly to the variability of the dataset, did not correlate with pH levels, suggesting that other factors influence its peak characteristics. The bar plots of variables’ contribution (Fig. 3c, d) show that the mobile phase additive and pH are the primary independent drivers of the variability along Dim 1, while the additive alone influences Dim 2.
The peak skewness variation in Dim 1 is notably higher for quetiapine, zolpidem, mirtazapine, flupentixol, and trazodone, while fentanyl, haloperidol, citalopram, amisulpride, and sertraline primarily contribute to Dim 2. This indicates that the peak distortion of these compounds is sensitive to changes in stationary and mobile phase composition. The plots show that the mobile phase additive is the key factor affecting peak skewness, with the type of stationary phase and the organic solvent playing minor roles. Unlike the correlation circle, these plots clarify the factors affecting fentanyl skewness, which primarily stems from the nature of the mobile phase additive rather than from the pH or stationary phase used.
The factorial map (Fig. 4a) identifies hydrogen carbonate-containing phases and formic acid as key factors affecting peak skewness. On the map, the alkaline additives cluster to the left and the acidic ones to the right, indicating the correlation of Dim 1 with pH, while a pH-independent pattern is evident along Dim 2. The impact of additives such as acetic acid, ammonium acetate, acetate buffer, ammonium formate, or formate buffer on the peak skewness is less pronounced. The use of acetate buffer or the Luna OMEGA Polar C18 stationary phase results in similar effects on skewness. The grouping of acetic acid and formate buffer shows that they exert similar effects as well. The map also shows that stationary phases such as Triart C18, Triart Phenyl, and Acquity BEH C18, along with acetonitrile, lead to similar outcomes, different from those of Triart C18 ExRS and methanol. However, according to the contribution values, the overall influence of stationary phases and solvents is minimal, with the exception of the Luna OMEGA Polar C18, which diverges from the other stationary phases.
Fig. 4 Evaluation of peak skewness. a Factorial map and b plot of individuals, projected onto the first two principal dimensions. The numerical annotations correspond to the individuals’ ID (Suppl. Information, Table S1). The circles highlight combinations giving optimal (black) and suboptimal (red) results, based on the first and last deciles for the absolute mean peak skewness
The plot of individuals (Fig. 4b) shows a distribution forming a cluster on the right, with two tails pointing to the lower left. The plot also shows a pattern according to the mobile phase additive. Incorporating the third principal dimension into the evaluation reveals that alkaline additives form a separate cluster in the 3D space (Suppl. Information, Fig. S3). Color coding by organic solvent or stationary phase (Suppl. Information, Fig. S2a, S2b) reveals no discernible pattern, indicating a random distribution of individuals according to these factors and corroborating their negligible effect on peak skewness. A retrospective analysis of the original dataset showed the lowest mean absolute peak skewness in individuals corresponding to IDs 7, 18, 78, 77, 73, 63, 74, and 13 (1st decile), while the highest corresponds to IDs 30, 19, 11, 49, 59, 52, 58, and 12 (10th decile). A summary of the available data indicates that optimal results stem from using hydrogen carbonate or formic acid-containing phases, Triart C18, Triart Phenyl, or Acquity BEH C18 as stationary phases, and acetonitrile as the solvent. Conversely, high peak distortion is very strongly associated with the use of Luna OMEGA Polar C18, acetate buffer, or ammonium formate.
Hierarchical clustering evaluation
The HC dendrogram (Fig. 5) showed that the distance on the plot of individuals (Fig. 4b, Suppl. Information, Fig. S3) may not fully capture the variability across other principal dimensions. Using the full range of principal dimensions from FAMD, HC provided a more comprehensive representation for similarity assessment and helped to identify combinations of chromatographic parameters that yield the most closely related results. In investigating retention behavior via HC, the output dendrogram was segmented into six groups (Fig. 5a), each characterized by a unique combination of chromatographic parameters (Suppl. Information, Table S5). The clustering in the dendrogram was consistent with the results obtained from the plot of individuals (Fig. 2b).
Fig. 5 Dendrograms based on FAMD results to assess similarity among individuals in the investigation of factors influencing a retention behavior and b peak skewness. Individuals giving optimal results are marked with a black asterisk (*) and those giving suboptimal results with a red dot
For example, the combinations yielding the highest mean retention were present in cluster No. 6, which corresponded to the use of the alkaline mobile phases and methanol as the solvent. Conversely, the lowest mean retention was observed in clusters No. 1 and 2, typically linked to acidic additives, acetonitrile, or the Luna OMEGA Polar C18 column.
The peak skewness investigation revealed that IDs representing optimal or suboptimal conditions are distributed across multiple clusters (Fig. 5b), indicating that a set of more diverse independent parameters can lead to similar outcomes. This finding aligns well with the distribution observed in Fig. 4b, where, in particular, individuals producing lower mean values of peak skewness exhibited a wider distribution across the principal dimensions Dim 1 and Dim 2.
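The idea of grouping individuals by their coordinates on all principal dimensions, and then cutting the dendrogram into a chosen number of groups, can be sketched with a small agglomerative procedure in Python. The linkage rule used in the study is not stated in this section; complete linkage on Euclidean distances is assumed here purely for illustration, and the coordinates are hypothetical.

```python
from math import dist

def agglomerate(points, n_clusters):
    """Merge the two closest clusters (complete linkage) until n_clusters remain."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Complete linkage: distance between the two farthest members
        return max(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = [(linkage(clusters[a], clusters[b]), a, b)
                 for a in range(len(clusters))
                 for b in range(a + 1, len(clusters))]
        _, a, b = min(pairs)
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical coordinates of six individuals on three principal dimensions
coords = [
    (0.0, 0.1, 0.0), (0.2, 0.0, 0.1),
    (5.0, 5.1, 0.0), (5.2, 4.9, 0.1),
    (-4.0, 3.0, 2.0), (-4.1, 3.2, 2.1),
]

groups = sorted(agglomerate(coords, n_clusters=3))
print(groups)
```

Because the distances use every supplied dimension, two individuals that look close on a two-dimensional plot can still fall into different groups, which is the point made above about the dendrogram complementing the plot of individuals.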
Summary of data analysis
The results suggest that combinations of chromatographic parameters leading to low peak distortion tend to result in reduced retention times and vice versa, indicating that the selection of chromatography parameters necessitates a balance between these opposing outcomes. Since there was no overlap between the two groups, further evaluation was undertaken. It was determined that retention should be prioritized over peak distortion at this stage, as variations in peak skewness were less pronounced in the dataset compared to differences in retention. The data analysis revealed that the use of methanol and hydrogen carbonate buffer was the factor contributing to higher retention. Based on this, the skewness data table was sorted by increasing overall peak skewness, and the first entry utilizing methanol as the solvent and hydrogen carbonate buffer as the additive was selected. This corresponded to record ID 71, which exhibited the second-highest mean retention and was classified within the cluster associated with low overall peak distortion (Figs. 4b and 5b). This combination represents an effective compromise between achieving higher retention and maintaining peak shape integrity.
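The selection rule described above (sort by increasing overall peak skewness, then take the first record that uses methanol and hydrogen carbonate buffer) can be expressed in a few lines of Python. The records, their IDs, and the skewness values below are hypothetical placeholders, not entries from the study's data table.

```python
# Hypothetical records; "skew" stands for the overall (absolute mean) peak skewness
records = [
    {"id": "A", "solvent": "acetonitrile", "additive": "formic acid", "skew": 0.10},
    {"id": "D", "solvent": "methanol", "additive": "hydrogen carbonate", "skew": 0.40},
    {"id": "B", "solvent": "methanol", "additive": "acetic acid", "skew": 0.15},
    {"id": "C", "solvent": "methanol", "additive": "hydrogen carbonate", "skew": 0.22},
]

# Sort by increasing peak skewness, so the least distorted conditions come first
ranked = sorted(records, key=lambda r: r["skew"])

# First entry that also satisfies the retention-driven constraint
choice = next(
    r for r in ranked
    if r["solvent"] == "methanol" and r["additive"] == "hydrogen carbonate"
)
print(choice["id"])
```

The rule encodes the stated priority: retention fixes the solvent and additive, and peak skewness then ranks the remaining candidates.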
The K-fold cross-validation showed good stability of the FAMD method as demonstrated by only insignificant changes in the coordinates of individuals (Suppl. Information, Fig. S4) and variables (Suppl. Information, Fig. S5) in Dim 1 and Dim 2. This indicates that the obtained groupings and trends are neither random nor related to the particular structure of the dataset.
Flow rate and temperature optimization
The initial screening experiment, subjected to FAMD analysis, was conducted using a generic monolinear gradient at a temperature of 40 °C and a flow rate of 0.35 mL/min. Although the combination of chromatographic parameters identified under ID 71 offered sufficient retention of compounds and satisfactory peak shapes, the separation of the critical pairs olanzapine-mirtazapine, quetiapine-trazodone, and sertraline-flupentixol was suboptimal. Although these compounds have different MRM transitions, which allows their analysis with a mass spectrometric detector, efforts were focused on achieving the best possible separation in case a non-selective detector, such as UV–VIS, is used for analysis. Consequently, the generic gradient was modified to an optimized gradient, which resolved most compounds except the olanzapine-mirtazapine pair. To resolve these compounds, temperature and flow rate were optimized using the BBD optimization method. The contour plot and response surface plot resulting from the optimization are presented in Fig. S6 (Suppl. Information). Based on the results, the optimal separation of the critical pair was achieved at a temperature of 60 °C and a flow rate of 0.3 mL/min. The resulting chromatogram is presented in Fig. 6.
Fig. 6 Chromatogram corresponding to the optimal combination of independent chromatographic parameters (“Samples and solutions”, optimized gradient separation) obtained through FAMD, HC, and BBD analysis. Compound IDs: 1, methamphetamine; 2, amisulpride; 3, clonazepam; 4, alprazolam; 5, zolpidem; 6, diazepam; 7, olanzapine; 8, citalopram; 9, mirtazapine; 10, quetiapine; 11, trazodone; 12, haloperidol; 13, fentanyl; 14, flupentixol; 15, sertraline