Enhancing predictive models for egg donation: time to blastocyst hatching and machine learning insights

The incorporation of artificial intelligence (AI) into embryology represents a promising advancement for enhancing assisted reproduction [21]. Although AI is still at an experimental stage in this field [19, 24], interest is growing rapidly in the scientific literature on IVF laboratories. Multiple algorithms have been introduced and rigorously assessed, addressing various aspects of the IVF cycle [21, 22, 30, 31, 32].

In this research we developed simple ML algorithms to predict implantation and live birth in egg donation programs. These programs are ideal settings for this kind of study because their populations and clinical outcomes are consistent. In fact, as in our study, important embryo selection models have previously been developed in egg donation cycles.

In our KILBD study, the implantation and live birth rates of the donation cycles were 56.08% and 41.26%, respectively, consistent with previous publications in our egg donation program [26, 33].

Baseline characteristics of our patients were similar between groups, with the exception of the age of the recipient, which was significantly lower in the group with positive implantation (Table 1). However, no significant differences were found after statistical analysis considering the age of the recipient as a confounding factor. Moreover, given the similar donor age in both groups, we consider that this finding has no impact on the clinical results and could be an incidental finding in our study population.

Given that 53.7% of the embryos analyzed in our study originated from vitrified/warmed oocytes, we aimed to investigate the potential impact of the vitrification/warming process on clinical outcomes. As previously mentioned in the results section, the logistics of our donation program prevented us from including cycles exclusively with either fresh or vitrified oocytes. Our examination, as detailed in Supplementary Table 1, revealed that patients from both vitrified and fresh oocyte groups exhibited comparable baseline characteristics. Furthermore, our analysis of morphokinetic and morphological variables, as well as the calculated time intervals incorporated into our KILBD, demonstrated similar values between fresh and vitrified oocytes (Supplementary Table 2), ultimately yielding consistent clinical outcomes (Supplementary Table 3). In a recent publication, Montgomery et al. found a significant delay of 2–3 h across all early cleavage divisions (2- through to 8-cell) and in the time to start of compaction in the vitrified oocyte group versus fresh oocyte controls [34]. However, they did not find differences in the time to reach the blastocyst stage or in the derived clinical outcomes [34], in accordance with our results. Murria et al. found that embryo scores provided by AI algorithms were lower for embryos originating from vitrified/warmed oocytes than for those from fresh oocytes [35]. However, the potential impact on clinical outcomes remains to be fully assessed, pending evaluation with a larger sample of analyzed embryos.

Our findings highlight the significant impact of natural hatching on implantation, with 57.01% of implanted blastocysts initiating hatching compared to 43.37% among non-hatching embryos. The statistical analysis revealed a significant difference between implanted and non-implanted embryos regarding the timing of hatching out of the zona pellucida (TiH, p = 0.039).

Hatching is a critical step. At the blastocyst stage, the embryo undergoes two crucial processes: hatching and implantation. These processes are essential for initiating post-implantation development and are influenced by cleavage-stage development, ultimately determining pregnancy outcomes. Any defects in blastocyst hatching and implantation can result in early embryo loss and infertility [36]. More than 30% of embryos are estimated to be lost during implantation, with approximately 55% of blastocysts failing to hatch [36]. In fact, in cases of implantation failure during in vitro fertilization procedures, the proportion of unhatched embryos has been reported to range from 50 to 70% [37].

However, previous studies have not placed significant emphasis on hatching observation in ML models. Key predictors in algorithms for predicting blastocyst implantation include time to 5 cells, length of the second cell cycle, and second-to-third division synchrony [2, 38, 39]. Additionally, time to morula and blastocyst formation have shown high predictive value [2, 40]. A review has identified the most critical parameters as the length of the second cell cycle, time to five cells, and second synchrony [5].

We considered including blastocyst diameter and inner cell mass (ICM) size before transfer (at 110 h post-injection) as potential variables, given their presumed relevance. Euploid blastocysts, associated with higher implantation rates, tend to expand earlier than aneuploid ones [41]. Both diameters were measured twice, and the KILBD analysis used their mean values. However, these parameters did not exhibit strong predictive capacity in the implantation and live birth models developed. The diameter of blastocysts that resulted in implantation and in live birth averaged 158.03 µm and 157.88 µm, respectively, while that of blastocysts that did not result in implantation or in live birth averaged 154.67 µm and 155.62 µm, respectively (Supplementary Tables 3 and 5). Furthermore, ICM size did not emerge as an influential variable in any model. In contrast, Almagor et al. [42] reported a correlation between blastocyst and ICM diameter and clinical outcomes.

Previous studies, such as that by Bori et al. [13], emphasized the importance of maximizing relevant information to enhance predictive capability. Bori and colleagues proposed novel markers, including pronuclear kinetics, ICM, and blastocyst measurements, for inclusion in AI models to predict implantation in egg donation cycles.

In our study, after data preprocessing to identify key variables, blastocyst diameter was not selected. Surprisingly, subsequent ML models remained highly predictive. This contrasts with the notion that "all characteristics together were more predictive than individually" [13]. However, we agree with Bori et al. on the potential for improved predictive value through the inclusion of patient-related parameters. Recently, H. Liu et al. demonstrated significant advancements in predictive modeling for live birth based on blastocyst evaluation and clinical features, achieving an AUC of 0.77 [20]. In our study, utilizing exclusively kinetic and morphological embryo parameters, our AdaBoost ML model achieved an AUC of 0.749 for live birth prediction, indicating promising performance. This raises the question of integrating additional clinical parameters to potentially surpass an AUC of 0.8, a benchmark that has yet to be reached by any model.
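
For readers less familiar with the metric, the AUC values quoted above can be read as the probability that a model assigns a higher score to a randomly chosen positive case (for example, a blastocyst that led to a live birth) than to a randomly chosen negative one. The following is a minimal sketch of that definition only; the function name `auc` and the labels and scores are hypothetical illustrations, not data or code from the study.

```python
from itertools import product


def auc(labels, scores):
    """Probability that a random positive case outscores a random negative one.

    Ties count as half a win, which matches the usual ROC AUC definition.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Compare every positive score with every negative score.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


# Hypothetical example: 1 = live birth, 0 = no live birth.
# The scores stand in for the outputs of any classifier.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.6, 0.65]
print(auc(labels, scores))
```

Under this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why differences such as 0.749 versus 0.77 are modest but meaningful.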

Another compelling aspect to explore would be evaluating the predictive performance of these ML models for implantation and live birth outcomes and comparing it with the decision-making proficiency of expert embryologists in embryo selection. In this context, Fordham and colleagues found that the AUC of a deep neural network (DNN) for predicting embryo implantation was higher than that achieved by embryologists overall (0.70 for the DNN vs 0.61 for embryologists) [43].

Some authors suggest that integrating demographic parameters, as proposed by Petersen et al. [39], may pose data acquisition challenges. Conversely, Cai et al. [44] argue that demographic characteristics are inherently implicit within embryo morphokinetics. Furthermore, d'Estaing et al. [45] caution that excessive parameter inclusion during algorithm construction could undermine predictive accuracy.

Kovacic et al. [46] emphasized that various confounding factors, including laboratory conditions and manual parameter annotation, may undermine the reliability of embryo selection algorithms. In fact, one limitation of our study is its susceptibility to bias from the manual annotation of morphokinetic parameters. Although annotation was conducted by a single trained embryologist at the study coordination center, ideally it should have been performed by two operators, with a third in case of disagreement. While some algorithms rely on manual annotation, its subjectivity could be mitigated through automation. However, current automated methods face challenges in recognizing direct divisions, cell fusions, or abnormal cell nucleus division [22]. Nonetheless, recent publications have demonstrated the efficacy and reliability of automated annotation [47, 48, 49].
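
The two-operator scheme with a third annotator for disagreements, mentioned above as the ideal, could be sketched as follows. This is only an illustration under stated assumptions: the function name `consensus_time`, the agreement tolerance of 1 h, and the use of the median of three annotations to resolve a disagreement are all hypothetical choices, not a procedure described in the study.

```python
def consensus_time(first, second, third=None, tolerance_h=1.0):
    """Combine two operators' annotations of one event time (in hours).

    Assumed rule: if the two annotations agree within tolerance_h, return
    their mean. Otherwise a third annotation is required, and the median
    of the three values is returned.
    """
    if abs(first - second) <= tolerance_h:
        return (first + second) / 2
    if third is None:
        raise ValueError("operators disagree; a third annotation is needed")
    return sorted([first, second, third])[1]


# Hypothetical annotations of the same event by different operators.
print(consensus_time(100.0, 100.5))          # operators agree
print(consensus_time(100.0, 104.0, 103.0))   # third operator adjudicates
```

Taking the median discards the outlying annotation rather than averaging it in, but other adjudication rules would be equally defensible.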

Another limitation of our study is its retrospective nature. Ideally, algorithms require external validation to assess their robustness and accuracy under conditions different from those in which they were developed [31]. For that purpose, interfaces that integrate the algorithms and make them easier for embryologists to use would be highly advisable [30]. Unfortunately, despite our ongoing work in this area, we currently lack the external validation data that would be crucial for generalizing the findings. A prospective study is also pending. Nevertheless, it is important to consider the significance of the hatching-related variables, which we believe should be incorporated into other algorithms to enhance their predictions.

On the other hand, a key strength of our study lies in the homogeneity of the egg recipient population and in its multicentric approach, in which inter-center variability was mitigated by uniform laboratory protocols and procedures, resulting in consistent clinical outcomes across centers. Moreover, low-quality seminal samples were excluded to avoid the potential biases they could exert on embryo quality and kinetics.

In our study, we developed ML algorithms based on morphological and morphokinetic features, after first performing variable selection to improve predictive performance [45]. Four of the eleven implantation ML models had an AUC > 0.70 (Table 3). In general, the kinetic variables related to blastocyst expansion and hatching were the most important variables associated with implantation, and also with live birth for the AdaBoost algorithm. This led us to regard observation of the hatching process prior to transfer as a crucial factor in predicting implantation potential, as previously discussed. However, the ML models for live birth behaved differently and had less predictive power, as shown in Table 4. This could be due to the contribution of unknown maternal clinical features and of endometrium status-related features, such as endometrial preparation, thickness and pattern, which are also critical factors impacting live birth outcomes [20]. Other variables with high predictive weight for implantation and live birth were those related to syngamy (duration of visible pronuclei, DESAPPN-APPN), embryo cleavage during genomic imprinting (synchronization of cleavage patterns, T8-T5), and embryo compaction (duration of compaction, TM-TiCOM; duration from compaction until the first sign of cavitation, TiCAV-TM; and time to early compaction, TiCOM) (Figs. 4 and 5). These processes may contribute significantly to achieving successful implantation and live birth. Furthermore, RF consistently demonstrated superior performance in predicting both implantation and live birth. Other authors have used this ML algorithm and reported high performance [50], including for predicting first-trimester miscarriage [51].
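
The interval variables named above are differences between annotated event times. A minimal sketch of how such intervals might be derived for one embryo is given below. It assumes that each label reads as "first event minus second event" and that annotations are stored in hours post-injection; the helper `derive_intervals`, the dictionary layout, and the example values are hypothetical and are not taken from the study data.

```python
# Interval labels named in the text, mapped to (later event, earlier event).
# Reading each label as "first minus second" is an assumption of this sketch.
INTERVALS = {
    "DESAPPN-APPN": ("DESAPPN", "APPN"),  # duration of visible pronuclei
    "T8-T5": ("T8", "T5"),                # synchronization of cleavage patterns
    "TM-TiCOM": ("TM", "TiCOM"),          # duration of compaction
    "TiCAV-TM": ("TiCAV", "TM"),          # compaction to first sign of cavitation
}


def derive_intervals(times):
    """Return interval durations in hours; None when an annotation is missing."""
    out = {}
    for name, (end, start) in INTERVALS.items():
        t_end, t_start = times.get(end), times.get(start)
        if t_end is None or t_start is None:
            out[name] = None
        else:
            out[name] = round(t_end - t_start, 2)
    return out


# Hypothetical annotations for one embryo, in hours post-injection.
# TiH is None because this example embryo had not started hatching.
embryo = {"APPN": 8.5, "DESAPPN": 23.0, "T5": 49.0, "T8": 56.5,
          "TiCOM": 78.0, "TM": 90.5, "TiCAV": 96.0, "TiH": None}
print(derive_intervals(embryo))
```

Returning None for intervals with a missing endpoint keeps unobserved events, such as hatching that had not begun before transfer, distinct from a duration of zero.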

However, the available scientific evidence supporting the routine use of these techniques to select the embryo most likely to lead to a live birth is still insufficient. In fact, different currently available algorithms reach different conclusions even when trained on the same data, underlining the essential importance of a well-designed mathematical and computational approach [6, 22]. Furthermore, AI models used for blastocyst selection need supervision by an expert embryologist to validate the results before embryo transfer, and some authors even state that every laboratory must create its own selection algorithm [52].
